Preface


Many problems are naturally expressed in terms of constraints, for example,
scheduling, design, configuration, and diagnosis problems. The constraint
programming community is providing the infrastructure for understanding and
computing with constraints. The Second International Conference on Principles
and Practice of Constraint Programming brings together members of this
community to share recent progress.

The conference is dedicated to Paris Kanellakis, who died in a tragic
airplane crash in December of 1995. He was one of the founders of this
conference and a pillar of this community. A Kanellakis Prize was awarded
to the paper that best exemplifies the interdisciplinary spirit of the
conference.

Thirty-six papers will be presented at the conference, selected from
over one hundred submissions. Twenty-two selections will be presented
at a conference poster session. Full text of the papers and abstracts of
the posters are included in the proceedings.

The conference has received support from Applied Logic Systems, Inc.
(ALS) and the Office of Naval Research (ONR). The Kanellakis Prize
has received sponsorship from MIT Press, Peter Revesz, and Springer-Verlag.
The Conference is being held in cooperation with the American
Association for Artificial Intelligence (AAAI), the Canadian Society for
the Computational Studies of Intelligence (CSCSI), and Tufts University.
The Program Chair received invaluable assistance from Daniel and
Mihaela Sabin.

Updated information about the conference will be posted at the conference
web site: http://www.cs.ualberta.ca/~ai/cp96/. The conference
will be held in Cambridge, Massachusetts, USA, August 19-22, 1996.


July  1996 


Eugene  C.  Freuder,  Program  Chair 
Durham,  New  Hampshire,  USA 


In  Memoriam:  Paris  C.  Kanellakis 


On  December  20,  1995,  Paris  C.  Kanellakis  died  unexpectedly  and  tragically, 
together  with  his  wife,  Maria-Teresa  Otoya,  and  their  children,  Alexandra  and 
Stephanos. They were heading to Cali, Colombia, for an annual holiday reunion
when  their  airplane  crashed  in  the  Andes. 

Paris  was  born  in  Greece  in  1953.  He  graduated  in  electrical  engineering 
from  the  National  Technical  University  of  Athens  in  1976;  his  undergraduate 
thesis was entitled Easy-to-test Criteria for Weak Stochastic Stability of
Dynamical Systems and was supervised by Prof. E.N. Protonotarios. In 1978, Paris
received his M.Sc. degree in electrical engineering and computer science from
the Massachusetts Institute of Technology. His M.Sc. thesis, Algorithms for a
Scheduling Application of the Asymmetric Travelling Salesman Problem, was
supervised  by  Profs.  R.  Rivest  and  M.  Athans.  In  1982,  he  was  awarded  his 
Ph.D.  degree  from  the  same  institution;  his  thesis  was  supervised  by  Prof.  C.H. 
Papadimitriou  and  was  entitled  On  the  Complexity  of  Concurrency  Control  for 
Distributed  Databases. 

Paris joined the department of computer science at Brown University as
assistant professor in 1981. He was promoted to associate professor with tenure in
1986, and to full professor in 1990. He was awarded an IBM Faculty Development
Award in 1985 and an Alfred Sloan Foundation Fellowship in 1987. He served as
an associate editor for the Journal of Logic Programming and for the new
journal Constraints, as well as for Information and Computation, ACM Transactions
on Database Systems, SIAM Journal of Computing, and Theoretical Computer
Science. He served as invited speaker, program chair, and program committee
member at many prominent conferences. In the constraint programming area,
he was program chair (together with J.-L. Lassez and V. Saraswat) of the First
International Workshop on Principles and Practice of Constraint Programming,
held in Newport in April 1993. This workshop, and the subsequent one held in
Orcas Island by Alan Borning, were instrumental in starting the series of
constraint programming conferences, and Paris played a critical role in organizing
the community and the conferences. In the related area of logic programming,
he was an invited speaker at the 6th International Conference on Logic
Programming in Lisbon, where his talk was entitled A Logical Query Language with
Object Identity and Strong Typing. He was on the program committees of logic
programming conferences in 1989, 1990, 1992, 1993, and 1996.

As  a  scientist,  Paris  was  a  careful  thinker,  investigating  fundamental  issues 
in  computer  science,  opening  new  technical  areas,  and  challenging  conventional 
belief whenever appropriate. He made numerous contributions to computer
science in areas as diverse as databases (relational, object-oriented, and constraint
databases, concurrency control), programming languages (lambda calculus, logic
programming, rewriting systems, type inference), distributed computing
(concurrency and fault-tolerance), complexity theory, and combinatorial
optimization. Underlying those contributions was a unifying theme: the use of logic,
complexity theory, and algorithmics to understand the foundations of
practical systems, to analyse their efficiency, and to improve their functionality. This
theme  was  nicely  exemplified  in  his  work  on  object-oriented  databases  featured 
at  the  logic  programming  conference  in  Lisbon.  Here  his  desire  to  understand  the 
object-oriented  database  O2  led  him  to  invent,  in  collaborative  work,  an  object- 
based  data  model,  a  new  formalization  of  object  identity,  new  programming 
tools,  and  new  indexing  algorithms. 

A beautiful account of Paris’ recent research accomplishments by S. Abiteboul,
G. Kuper, H. Mairson, A. Shvartsman, and M. Vardi appeared in the
March  issue  of  ACM  Computing  Surveys.  It  was  a  major  source  of  inspiration 
for  this  short  article,  in  which  only  some  of  Paris’  contributions  to  constraint 
programming  (taken  broadly)  can  be  outlined. 

The  first  issue  of  the  Journal  of  Logic  Programming  featured  an  article  by 
C.  Dwork,  P.  Kanellakis,  and  J.  Mitchell  entitled  On  the  Sequential  Nature  of 
Unification.  The  paper  shows  that  the  decision  problem  “Do  two  terms  unify?”  is 
complete  for  PTIME  which,  informally  speaking,  means  that  unification  cannot 
be  sped  up  with  a  polynomially  bounded  number  of  processors.  This  paper  was 
published  during  a  period  of  intense  activity  on  the  parallelization  of  Prolog. 
Together  with  J.  Mitchell,  Paris  subsequently  used  the  essential  idea  behind  the 
proof  to  show  that  type  inference  in  ML  was  PSPACE-hard,  i.e.,  as  hard  as  any 
problem  that  can  be  solved  in  polynomial  space.  This  result  contradicted  the 
popular  belief  at  the  time  that  ML  typing  was  efficient.  His  subsequent  joint 
paper,  in  collaboration  with  H.  Mairson  and  J.  Mitchell,  showed  the  problem 
to  be  complete  for  EXPTIME.  Paris’  most  recent  work  on  the  lambda  calculus 
(in  collaboration  with  G.  Hillebrand  and  H.  Mairson)  led  to  a  new  syntactic 
characterization  of  the  complexity  classes,  which  emerged  from  their  research  on 
a functional programming foundation for a logic-based database query language.

Together with G. Kuper and P. Revesz, Paris was the founder of the area of
constraint databases, whose essential idea is to replace, in the relational model,
the concept of tuples by a conjunction of constraints. They investigated the
query complexity of this scheme (which parallels in the database world the area
of constraint logic programming) for various classes of constraints. Together with
his colleagues and his students, he was also engaged in a long-term research effort to
build the implementation technology (in particular, the indexing structures)
necessary to make this technology practical. In particular, together with D. Goldin,
he investigated constraint query algebras, a class of monotone constraints that
allows an efficient projection algorithm, and similarity queries with scaling and
shifting. Part of this work was featured in CP’95 and in a journal article to
appear in Constraints.

Those of us who collaborated closely with Paris have lost not only an
outstanding scientist but also an esteemed colleague and a dear friend. As a
colleague, Paris had the poise, the personality, and the energy to rally communities
behind him and he used these skills to improve our academic and professional
environment. We also mourn a friend with a charming and engaging personality
and a Mediterranean passion, and a family whose warmth and hospitality will
be sorely missed.

In  writing  these  few  pages,  I  came  to  understand  one  more  time  how  fortunate 
I  was  to  collaborate  with  Paris,  to  observe  him  in  his  daily  scientific  and  family 
life,  and  to  benefit  from  his  insights,  vision,  and  broad  expertise;  and  of  course 
to  realize  how  my  life  has  changed  since  I  met  him.  He  was  a  very  special  person. 


Pascal  Van  Hentenryck 
Brown  University 
On Confluence of Constraint Handling Rules

Abstract. We introduce the notion of confluence for Constraint Handling
Rules  (CHR),  a  powerful  language  for  writing  constraint  solvers.  With  CHR 
one  simplifies  and  solves  constraints  by  applying  rules.  Confluence  guarantees 
that  a  CHR  program  will  always  compute  the  same  result  for  a  given  set  of 
constraints  independent  of  which  rules  are  applied.  We  give  a  decidable, 
sufficient  and  necessary  syntactic  condition  for  confluence. 

Confluence  turns  out  to  be  an  essential  syntactical  property  of  CHR  programs 
for  two  reasons.  First,  confluence  implies  correctness  (as  will  be  shown  in  this 
paper).  In  a  correct  CHR  program,  application  of  CHR  rules  preserves  logical 
equivalence  of  the  simplified  constraints.  Secondly,  even  when  the  program  is 
already correct, confluence is highly desirable. Otherwise, given some
constraints, one computation may detect their inconsistency while another one
may  just  simplify  them  into  a  still  complex  constraint. 

As a side-effect, the paper also gives soundness and completeness results for
CHR programs. Due to their special nature, and in particular correctness,
these theorems are stronger than what holds for the related families of
(concurrent) constraint programming languages.

Keywords:  constraint  reasoning,  semantics  of  programming  languages, 
committed-choice  languages,  confluence  and  determinacy. 


1  Introduction 

Constraint  Handling  Rules  (CHR)  [Pru95]  have  been  designed  as  a  special-purpose 
language  for  writing  constraint  solvers.  A  constraint  solver  stores  and  simplifies 
incoming  constraints.  CHR  is  essentially  a  committed-choice  language  consisting  of 
guarded  rules  that  rewrite  constraints  into  simpler  ones  until  they  are  solved. 

In  contrast  to  the  family  of  the  general-purpose  concurrent  constraint  languages 
(CC) [Sar93] and the ALPS¹ [Mah87] framework, CHR allow “multiple heads”, i.e.
conjunctions of atoms in the head of a rule. Multiple heads are a feature that is
essential in solving conjunctions of constraints. With single-headed CHR rules alone,
unsatisfiability of constraints could not always be detected (e.g. X<Y, Y<X) and global
constraint  satisfaction  could  not  be  achieved. 

Nondeterminacy  in  CHR  arises  when  two  or  more  rules  can  fire.  It  is  obviously 
desirable that the result of a computation in a solver will always be the same,
semantically and syntactically, no matter which of the applicable CHR rules are applied. This property
of  constraint  solvers  will  be  called  confluence  and  investigated  in  this  paper. 


¹ Saraswat showed in [Sar93] that ALPS can be recognized as a subset of cc(↓, →).
We will introduce a decidable, sufficient and necessary syntactic condition for
confluence.  This  condition  adopts  the  notion  of  critical  pairs  as  known  from  term 
rewrite  systems  [DOS88,  KK91,  Pla93].  Monotonicity  of  constraint  store  updates, 
an  inherent  property  of  constraint  logic  programming  languages,  plays  a  central  role 
in  proving  that  joinability  of  critical  pairs  is  sufficient  for  local  confluence. 

Confluence turns out to be important with regard to both theoretical and
practical aspects: We show that confluence implies correctness of a program. By correctness
we mean that the declarative semantics of a CHR program is a consistent theory.
Unlike CC programs, CHR programs can be given a declarative semantics since they
are only concerned with defining constraints (i.e. first order predicates), not
procedures in their generality. Furthermore we show how to strengthen the declarative
reading  of  a  CHR  program  if  it  is  confluent.  A  practical  application  of  our  definition 
of  confluence  lies  in  program  analysis,  where  we  can  identify  non-confluent  parts  of 
CHR  programs  by  examining  the  critical  pairs.  Programs  with  non-confluent  parts 
essentially  represent  an  ill-defined  constraint  solving  algorithm. 

Our work extends previous approaches to the notion of determinacy in the field of
CC  languages:  Maher  investigates  in  [Mah87]  a  class  of  flat  committed  choice  logic 
languages  (ALPS).  He  defines  the  class  of  deterministic  ALPS  programs  as  those 
programs  whose  guards  are  mutually  exclusive.  The  class  of  deterministic  ALPS 
programs  is  less  expressive  than  confluent  CHR  programs.  Saraswat  defines  for  the 
CC  framework  a  similar  notion  of  determinacy  [Sar93],  which  is  also  more  restrictive 
than confluence. We also give two reasons why CHR cannot be made deterministic
in  general. 

Our approach is orthogonal to the work in program analysis in [MO95] and
[FGMP95], where a different, less rigid notion of confluence is defined: A CC program
is confluent, if different process schedulings (i.e. different orderings of decisions at
nondeterministic choice points) give rise to the same set of possible outcomes. The
idea of [MO95] is to introduce a non-standard semantics, which is confluent for all
CC programs.

The paper is organized as follows. The next section introduces the syntax of
constraint handling rules, their declarative and operational semantics. Then this section
contributes  to  the  relationship  between  the  declarative  and  operational  semantics  of 
CHR  programs  by  giving  soundness  and  completeness  results.  Section  3  presents  the 
notion  of  confluence  for  CHR.  In  section  4  we  show  that  confluence  implies  logical 
correctness  of  a  program.  This  leads  to  a  stronger  completeness  and  soundness  result 
for  finite  failed  computation.  Finally,  we  conclude  with  a  summary  and  directions 
for  future  work. 


2  Syntax  and  Semantics  of  CHR 

We  assume  some  familiarity  with  (concurrent)  constraint  programming  (CCP)  [JL87, 
JM94, SRP91, Sar93, Sha89]. There is a distinguished class of predicates, the
constraints. We assume that there is a built-in constraint solver that solves, checks and
simplifies built-in (predefined) constraints. On the other hand, the user-defined
constraints are those defined by a CHR program. This implies that we have two disjoint
sets of constraint symbols for the built-in and the user-defined constraints.

As a special purpose language, CHR usually extend a host language such as Prolog
or Lisp with (more) constraint solving capabilities. This also means that auxiliary
computations  in  CHR  programs  can  be  performed  in  the  host  language.  Without  loss 
of  generality,  to  keep  this  paper  self-contained,  we  will  not  address  host  language 
issues  here.  We  also  restrict  ourselves  to  the  main  kind  of  CHR  rule. 

Definition 1. A CHR program is a finite set of simplification rules². A simplification
rule is of the form

\[ H_1, \ldots, H_i \Leftrightarrow G_1, \ldots, G_j \mid B_1, \ldots, B_k. \]

where the multi-head $H_1, \ldots, H_i$ is a conjunction³ of user-defined constraints and
the guard $G_1, \ldots, G_j$ is a conjunction of built-in constraints and the body $B_1, \ldots, B_k$
is a conjunction of built-in and user-defined constraints called goals.
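
As an illustration only, the shape of a simplification rule can be written down as a small data structure. The sketch below is not part of the CHR formalism: the atom encoding, the `BUILT_IN` symbol set, and the name `is_well_formed` are assumptions made for this example, and it also assumes that a head contains at least one atom.

```python
from dataclasses import dataclass
from typing import Tuple

# An atom is (symbol, arguments), e.g. ("maximum", ("X", "Y", "Z")).
Atom = Tuple[str, Tuple[str, ...]]

# Assumed set of built-in constraint symbols; the paper leaves it abstract.
BUILT_IN = frozenset({"=", "=<", "<", "true", "false"})


@dataclass(frozen=True)
class SimplificationRule:
    head: Tuple[Atom, ...]   # H_1, ..., H_i: user-defined constraints only
    guard: Tuple[Atom, ...]  # G_1, ..., G_j: built-in constraints only
    body: Tuple[Atom, ...]   # B_1, ..., B_k: built-in and user-defined goals

    def is_well_formed(self) -> bool:
        """Check the syntactic restrictions on head and guard."""
        head_ok = bool(self.head) and all(s not in BUILT_IN for s, _ in self.head)
        guard_ok = all(s in BUILT_IN for s, _ in self.guard)
        return head_ok and guard_ok
```

The body is left unrestricted because it may mix built-in and user-defined constraints.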


2.1  Declarative  Semantics 

Unlike CC programs, CHR programs can be given a declarative semantics since
they  are  only  concerned  with  defining  constraints  (i.e.  first  order  predicates),  not 
procedures  in  their  generality. 

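Declaratively, a simplification rule

\[ H_1, \ldots, H_i \Leftrightarrow G_1, \ldots, G_j \mid B_1, \ldots, B_k. \]

is a logical equivalence provided the guard is true in the current context

\[ \forall \bar{x}\, \bigl( \exists \bar{y}\, (G_1 \land \ldots \land G_j) \bigr) \rightarrow \bigl( H_1 \land \ldots \land H_i \leftrightarrow \exists \bar{z}\, (B_1 \land \ldots \land B_k) \bigr), \]

where $\bar{x}$⁴ are the variables occurring in $H_1, \ldots, H_i$ and $\bar{y}$, $\bar{z}$ are the other variables
occurring in $G_1, \ldots, G_j$ and $B_1, \ldots, B_k$ respectively.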

The declarative interpretation of a CHR program P is given by the set $\mathcal{P}$ of logical
equivalences and a consistent built-in theory CT which determines the meaning of
the built-in constraints appearing in the program. The constraint theory CT specifies
among other things the ACI properties of the logical conjunction $\land$ in the built-in
constraint store, the properties of the equality constraint = (Clark's axiomatization)
and the properties of the basic constraints true and false.

Definition 2. A CHR program P is correct iff $\mathcal{P} \cup CT$ is consistent.


2.2  Operational  Semantics  of  CHR 

We  define  the  operational  semantics  as  a  transition  system. 

² There are two other kinds of rules [BFL⁺94], which are not treated here.
³ For conjunctions in rules we use “,” instead of “∧”.
⁴ We use $\bar{x}$ as an abbreviation for a sequence of variables.

States

Definition  3.  A  state  is  a  triple 


\[ \langle C_U, C_B, V \rangle. \]

$C_U$ is a conjunction of both user-defined and built-in constraints that remains to be
solved. $C_B$ is a conjunction of built-in constraints accumulated up to this point of
execution. V is an ordered set of variables.

Definition 4. A variable X in a state $\langle C_U, C_B, V \rangle$ is called global if it appears in
V. It is called local otherwise.

Definition 5. The pair $(C_1, C_2)$ ($C_1$ and $C_2$ are conjunctions of constraints) is called
enclosed by the ordered set V iff all variables shared by $C_1$ and $C_2$ are contained in
V.


We can attribute to each state $\langle C_U, C_B, V \rangle$ the formula

\[ \exists Y_1, \ldots, Y_m\; C_U \land C_B \]

as a logical meaning, where $Y_1, \ldots, Y_m$ are the local variables in $C_U$ and $C_B$. Note
that the global variables remain unbound in the formula.


Update. We now define the basic operation of the built-in constraint solver. The
main task of update is transforming a state into a logically equivalent state with a
normalized built-in constraint store. update performs the following tasks:

- normalize the built-in constraint store according to CT
- propagate equality constraints through the state
- remove redundant equality constraints where one side is a local variable.

Definition 6. update normalizes a state by performing the following operations in
sequence:

1. update produces a unique representation of the built-in constraint store
according to the theory CT.

2. Equality constraints of the form X=t receive a special treatment: occurrences of
X in all constraints (except the equality itself) in the built-in constraint store
and goal store are replaced by t.

3. All equality constraints of the form X=t or Y=X are removed, if X is local.
These equality constraints will be called local. This reflects the validity of
formulas (∃X X=a), which follows from the axioms in CT (see example 2.1).

Example  2.1 

\[ \mathrm{update}(\langle p(Y) \land q(Z),\; Y=f(X) \land Z=a,\; [Y] \rangle) = \langle p(f(X)) \land q(a),\; Y=f(X),\; [Y] \rangle \]
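
The propagation and removal steps of update can be sketched for the special case where the built-in store contains only equations that are already in solved form (each variable occurs at most once on a left-hand side and never inside its own right-hand side). This is a simplified illustration, not the general update of Definition 6: normalization according to CT is skipped, equations of the form Y=X with a local X on the right are not handled, and the term encoding (variables as capitalized strings, compound terms as tuples) is an assumption of this example.

```python
def subst(term, var, value):
    """Replace every occurrence of variable `var` in `term` by `value`."""
    if term == var:
        return value
    if isinstance(term, tuple):  # compound term: (functor, arg1, ..., argn)
        return (term[0],) + tuple(subst(a, var, value) for a in term[1:])
    return term


def update(goals, equations, global_vars):
    """Propagate equations X=t through the state, then drop local ones."""
    goals = list(goals)
    eqs = list(equations)
    for i in range(len(eqs)):
        x, t = eqs[i]
        # Step 2: replace X by t everywhere except in the equation itself.
        goals = [subst(g, x, t) for g in goals]
        eqs = [(y, s) if j == i else (y, subst(s, x, t))
               for j, (y, s) in enumerate(eqs)]
    # Step 3: equations whose left-hand variable is local are removed.
    kept = [(x, t) for x, t in eqs if x in global_vars]
    return goals, kept, list(global_vars)
```

Called on the state of Example 2.1, with `goals = [("p", "Y"), ("q", "Z")]`, `equations = [("Y", ("f", "X")), ("Z", "a")]` and `global_vars = ["Y"]`, the sketch keeps Y=f(X) because Y is global and drops Z=a because Z is local.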

Under an enclosement condition update is compatible with addition of
constraints. This result is given by the following lemma, which is proven by
contradiction.

Lemma 7. If C is a conjunction of built-in constraints and $(C, C_B)$ is enclosed by
V and $\mathrm{update}(\langle C_U, C_B, V \rangle) = \langle C'_U, C'_B, V \rangle$ then

\[ \mathrm{update}(\langle C_U, C_B \land C, V \rangle) = \mathrm{update}(\langle C'_U, C'_B \land C, V \rangle). \]

The  enclosement  condition  in  the  lemma  above  reflects  the  sensitivity  of  update 
with  respect  to  local  variables.  It  guarantees  that  equality  constraints  involving 
variables  appearing  in  the  added  constraint  C  are  not  removed  due  to  locality.  If  the 
condition  is  violated,  the  claim  is  false: 

Example  2.2 


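\[ \mathrm{update}(\langle \mathit{true},\; X=2,\; [] \rangle) = \langle \mathit{true},\; \mathit{true},\; [] \rangle, \]
adding the built-in constraint X=1 on both sides results for the left side in:

\[ \mathrm{update}(\langle \mathit{true},\; X=2 \land X=1,\; [] \rangle) = \langle \mathit{true},\; \mathit{false},\; [] \rangle \]
but for the right side in:

\[ \mathrm{update}(\langle \mathit{true},\; \mathit{true} \land X=1,\; [] \rangle) = \langle \mathit{true},\; \mathit{true},\; [] \rangle \]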

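Definition 8. Entailment ($\rightarrow_o$) tests whether a given conjunction of built-in
constraints is implied by another conjunction of built-in constraints in the context of a
state and is defined as follows:

\[ \langle C_{U1}, C_{B1}, V \rangle \rightarrow_o \langle C_{U2}, C_{B2}, V \rangle \quad \text{iff} \quad \langle C'_{U1}, C'_{B1}, V \rangle = \mathrm{update}(\langle C'_{U2}, C'_{B1} \land C'_{B2}, V \rangle), \]

where $\mathrm{update}(\langle C_{U1}, C_{B1}, V \rangle) = \langle C'_{U1}, C'_{B1}, V \rangle$ and $\mathrm{update}(\langle C_{U2}, C_{B2}, V \rangle) = \langle C'_{U2}, C'_{B2}, V \rangle$.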

Computation Steps. Given a CHR program P we define the transition relation $\mapsto_P$
by introducing two kinds of computation steps:

Solve: $\langle C \land C_U, C_B, V \rangle \mapsto_P \mathrm{update}(\langle C_U, C \land C_B, V \rangle)$
if C is a built-in constraint.

The built-in constraint solver updates the state after adding the built-in constraint
C to the built-in store $C_B$.

Simplify: $\langle H' \land C_U, C_B, V \rangle \mapsto_P \mathrm{update}(\langle C_U \land B, H=H' \land C_B, V \rangle)$

if $(H \Leftrightarrow G \mid B)$ is a variant with fresh variables of a rule in P and
$\langle H', C_B, V \rangle \rightarrow_o \langle H', H=H' \land G, V \rangle$.

To  simplify  user-defined  atoms  means  to  apply  a  simplification  rule  on  these  atoms. 
This  can  be  done  if  the  atoms  match  with  the  head  atoms  of  the  rule  and  the  guard 
is entailed by the built-in constraint store. The atoms occurring in the body of the
rule  are  added  to  the  goal  constraint  store. 
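
The Simplify step described above can be sketched in a deliberately reduced setting. The sketch assumes that the arguments of head atoms are rule variables only, that the rule has already been renamed apart from the goal, and that guard entailment is delegated to a caller-supplied function `entails` (a hypothetical stand-in for the built-in solver). Instead of adding the equations H=H' to the built-in store and calling update, it applies the resulting binding directly to the guard and body.

```python
from itertools import permutations


def match_head(head, atoms):
    """One-way matching of rule head atoms against goal atoms, position by position."""
    binding = {}
    for (h_sym, h_args), (g_sym, g_args) in zip(head, atoms):
        if h_sym != g_sym or len(h_args) != len(g_args):
            return None
        for h, g in zip(h_args, g_args):
            # A rule variable must be bound consistently across all head atoms.
            if binding.setdefault(h, g) != g:
                return None
    return binding


def simplify(goals, store, rule, entails):
    """Apply `rule` once to the goal store; return (new_goals, store) or None."""
    head, guard, body = rule
    for idxs in permutations(range(len(goals)), len(head)):
        binding = match_head(head, [goals[i] for i in idxs])
        if binding is None:
            continue

        def inst(atom):
            sym, args = atom
            return (sym, tuple(binding.get(a, a) for a in args))

        # The instantiated guard must be entailed by the built-in store.
        if not entails(store, [inst(g) for g in guard]):
            continue
        # Matched atoms are removed; the instantiated body joins the goal store.
        rest = [g for i, g in enumerate(goals) if i not in idxs]
        return rest + [inst(b) for b in body], store
    return None
```

Trying all ordered selections of goal atoms is what makes the rule multi-headed: a two-atom head may match any two distinct atoms of the goal store, in either order. Built-in constraints placed in the goal store by the body would then be moved to the built-in store by subsequent Solve steps.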

Notation. By $c(t_1, \ldots, t_n) = c(s_1, \ldots, s_n)$ we mean $t_1 = s_1 \land \ldots \land t_n = s_n$, if c is a
user-defined constraint. By $p_1 \land \ldots \land p_n = q_1 \land \ldots \land q_n$ we mean $p_1 = q_1 \land \ldots \land p_n = q_n$.

Definition 9. $S \mapsto_P^* S'$ holds iff

\[ S = S' \text{ or } S = \mathrm{update}(S') \text{ or } S \mapsto_P S_1 \mapsto_P \ldots \mapsto_P S_n \mapsto_P S' \quad (n \geq 0). \]

We will write $\mapsto$ instead of $\mapsto_P$ and $\mapsto^*$ instead of $\mapsto_P^*$, if the program P is fixed.

Lemma 10. Update has no influence on application of rules, i.e.

\[ S \mapsto S' \text{ implies } \mathrm{update}(S) \mapsto S'. \]

The initial state consists of a goal G, an empty built-in constraint store and the
list V of the variables occurring in G,

<G,  true,  V>. 

A  computation  state  is  a  final  state  if 

- its built-in constraint store is false, then it is called failed;

- no computation step can be applied and its built-in constraint store is not false.
Then it is called successful.

Definition 11. A computation of a goal G is a sequence $S_0, S_1, \ldots$ of states with
$S_i \mapsto S_{i+1}$ beginning with the initial state $S_0 = \langle G, \mathit{true}, V \rangle$ and ending in
a  final  state  or  diverging.  A  finite  computation  is  successful  if  the  final  state  is 
successful.  It  is  failed  otherwise. 

Definition 12. A computable constraint C of G is the conjunction $\exists \bar{x}\; C_U \land C_B$,
where $C_U$ and $C_B$ occur in a state $\langle C_U, C_B, V \rangle$, which appears in a computation
of G. $\bar{x}$ are the local variables.

A final constraint C is the conjunction $\exists \bar{x}\; C_U \land C_B$, where $C_U$ and $C_B$ occur in a
final state $\langle C_U, C_B, V \rangle$.


Equivalence and Monotonicity. The following definition reflects the ACI
properties of the goal store and the fact that all states with an inconsistent built-in
constraint store are identified.

Definition 13. We identify states according to the equivalence relation $\cong$:

$\langle C_U, C_B, V \rangle \cong \langle C'_U, C_B, V \rangle$ iff $C_U$ can be transformed to $C'_U$ using the ACI
properties of the conjunction $\land$, or $C_B$ is false.

We have to ensure that the equivalence $\cong$ is well-defined, i.e. that it is compatible
with  the  operations  we  perform  on  states.  We  have  six  different  operations  working 
on  states,  1-3  are  explicitly  used  for  computation  steps,  whereas  4-6  occur  only  in 
the  proof  for  the  theorem  on  local  confluence: 

1.  Solve 

2.  Simplify 

3.  update 

4.  add  a  constraint  to  the  goal  store  or  built-in  constraint  store 

5. form a variant

6. replace the global variable store by another ordered set of variables

It is easy to see that all these operations are congruent with the relation $\cong$, i.e. the
following holds for each instance o of an operation:

\[ S_1 \cong S_2 \text{ implies } o(S_1) \cong o(S_2) \]

Therefore we can reason about states modulo $\cong$.

The  next  definition  defines  the  notion  of  monotonicity,  which  guarantees  that 
addition of new built-in constraints does not inhibit entailment (and hence the
application of Simplify):

Definition 14. A built-in constraint solver is said to be monotonic iff the following
holds:

\[ \langle C_{U1}, C_B, V \rangle \rightarrow_o \langle C_{U2}, G, V \rangle \text{ implies } \langle C_{U1}, C_B \land C, V \rangle \rightarrow_o \langle C_{U2}, G, V \rangle. \]

Lemma 15. Every built-in constraint solver (where update fulfills the stated
requirements) is monotonic.


2.3  Relation  between  the  declarative  and  the  operational  semantics 

We  present  results  relating  the  operational  and  declarative  semantics  of  CHR.  These 
results  are  based  on  work  of  Jaffar  and  Lassez  [JL87],  Maher  [Mah87]  and  van 
Hentenryck  [vH91]. 

Lemma  16.  Let  P  be  a  CHR  program,  G  be  a  goal.  If  C  is  a  computable  constraint 
of  G,  then 

\[ \mathcal{P}, CT \models \forall\, (C \leftrightarrow G). \] ⁵

Proof. By induction over the number of computation steps.

Theorem  17  Soundness  of  successful  computations.  Let  P  be  a  CHR  program 
and  G  be  a  goal.  If  G  has  a  successful  computation  with  final  constraint  C  then 

\[ \mathcal{P}, CT \models \forall\, (C \leftrightarrow G). \]

Proof.  Immediately  from  lemma  16. 

The  following  theorem  is  stronger  than  the  completeness  result  presented  in 
[Mah87],  in  the  way  that  we  can  reduce  the  disjunction  in  the  strong  completeness 
theorem  to  a  single  disjunct.  This  is  possible,  since  the  computation  steps  preserve 
logical  equivalence  (lemma  16). 

Theorem 18 Completeness of successful computations. Let P be a CHR
program and G be a goal. If $\mathcal{P}, CT \models \forall\, (C \leftrightarrow G)$ and C is satisfiable, then G has a
successful computation with final constraint C' such that

\[ \mathcal{P}, CT \models \forall\, (C \leftrightarrow C'). \]

⁵ $\forall F$ is the universal closure of a formula F.

The next theorem gives a soundness and completeness result for correct CHR
programs. 

Theorem  19  Soundness  and  Completeness  of  failed  computations. 

Let P be a correct CHR program and G be a goal. The following are equivalent:

a) $\mathcal{P}, CT \models \neg \exists G$

b)  G  has  a  finitely  failed  computation. 

3  Confluence  of  CHR  programs 

We  extend  the  notion  of  determinacy  as  used  by  Maher  in  [Mah87]  and  Saraswat  in 
[Sar93]  to  CHR  by  introducing  the  notion  of  confluence.  The  notion  of  deterministic 
programs is less expressive and too strict for the CHR formalism, because it is not
always  possible  to  transform  a  CHR  program  into  a  deterministic  one.  This  has  two 
reasons,  of  which  the  first  also  holds  for  the  CC  formalism: 

The constraint system must be closed under negation so that a single-headed
CHR  program  can  be  transformed  into  one  with  non-overlapping  guards. 

Example 1. We want to extend the built-in solver, which contains the built-in
constraints ≤ and =, with a user-defined constraint maximum(X,Y,Z) which holds if Z
is the maximum of X and Y. The following could be part of a definition for the
constraint maximum:

maximum(X,Y,Z) ⇔ X≤Y | Z=Y.
maximum(X1,Y1,Z1) ⇔ Y1≤X1 | Z1=X1.

This  program  cannot  be  transformed  into  an  equivalent  one  without  overlapping 
guards. 
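
To see what overlapping guards mean in practice, the two maximum rules can be simulated on ground integer arguments. This is only a sketch under stated assumptions: the guards are read as the non-strict comparisons X≤Y and Y≤X (so that both hold when X=Y), the arguments are plain Python integers rather than constrained variables, and the function name `applicable` is hypothetical.

```python
def applicable(x, y):
    """Return (rule name, value bound to Z) for every rule whose guard holds."""
    results = []
    if x <= y:  # first rule: guard X =< Y, body Z = Y
        results.append(("rule 1", y))
    if y <= x:  # second rule: guard Y =< X, body Z = X
        results.append(("rule 2", x))
    return results
```

For equal arguments both rules are applicable, so the program is not deterministic in the sense of mutually exclusive guards, yet both choices bind Z to the same value. Separating the guards would require a strict comparison, that is, the negation of one of the non-strict guards.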

The  second  reason  is  that  CHR  rules  have  multiple  heads.  We  can  get  into  a 
situation,  where  two  rules  can  be  applied  to  different  but  overlapping  conjunctions 
of  constraints.  In  general  it  is  not  possible  to  avoid  commitment  of  one  of  the  rules 
(and thus making the program deterministic⁶) by adding constraints to the guards.

Example 2. Consider the following part of a CHR program defining interactions
between the boolean operations not, imp and or.

not(X,Y), imp(X,Y) ⇔ true | X=0, Y=1.
not(X1,Y1), or(X1,Z1,Y1) ⇔ true | X1=0, Y1=1, Z1=1.

Note that both rules can be applied to the goal not(A,B) ∧ imp(A,B) ∧ or(A,C,B).
If we want only the first rule to be applicable, we have to add a condition
to the guard of the first rule stating that or(A,C,B) is not present. Such a condition is
meta-logical  and  syntactically  not  allowed. 

®  We  extend  the  notion  of  deterministic  programs  to  our  formalism  in  the  natural  way  that 
only  one  rule  can  commit  by  any  given  goal. 


In  the  following  we  will  adopt  and  extend  the  terminology  and  techniques  of 
conditional  term  rewriting  systems  (CTRS)  [DOS88].  A  straightforward  translation 
of  results  in  the  field  of  CTRS  was  not  possible,  because  the  CHR  formalism  gives 
rise  to  phenomena  not  appearing  in  CTRS.  These  include  the  existence  of  global 
knowledge  (the  built-in  constraint  store)  and  local  variables. 

Definition 20. A CHR program is called terminating if there are no infinite
computation sequences.

Definition 21. Two states $S_1$ and $S_2$ are called joinable if there exist states $S_1'$, $S_2'$
such that $S_1 \mapsto^* S_1'$ and $S_2 \mapsto^* S_2'$ and $S_1'$ is a variant of $S_2'$ ($S_1' \cong S_2'$).

Definition 22. A CHR program is called confluent if the following holds for all states
$S, S_1, S_2$:

If $S \mapsto^* S_1$, $S \mapsto^* S_2$ then $S_1$ and $S_2$ are joinable.

Definition 23. A CHR program is called locally confluent if the following holds for
all states $S, S_1, S_2$:

If $S \mapsto S_1$, $S \mapsto S_2$ then $S_1$ and $S_2$ are joinable.
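Definitions 21 to 23 can be tried out on a small finite transition system. The following Python sketch is only an illustration under simplifying assumptions: states are plain strings, the transition relation is a hypothetical dictionary `steps` (not a CHR state space), and variance of states is approximated by equality.

```python
from itertools import product

def reachable(steps, s):
    """All states reachable from s in zero or more transitions."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        for nxt in steps.get(cur, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def joinable(steps, s1, s2):
    """s1 and s2 are joinable if some state is reachable from both."""
    return bool(reachable(steps, s1) & reachable(steps, s2))

def locally_confluent(steps):
    """Every pair of one-step successors of a state is joinable."""
    return all(joinable(steps, a, b)
               for succs in steps.values()
               for a, b in product(succs, repeat=2))

def confluent(steps):
    """Every pair of states reachable from a common state is joinable."""
    states = set(steps) | {t for succs in steps.values() for t in succs}
    return all(joinable(steps, a, b)
               for s in states
               for a, b in product(reachable(steps, s), repeat=2))

# Hypothetical toy systems: a diamond that joins again, and a fork that does not.
diamond = {"s": ["a", "b"], "a": ["t"], "b": ["t"]}
fork = {"s": ["a", "b"]}
```

Both toy systems are terminating, so local confluence and confluence coincide on them, which is what the analogue of Newman's Lemma for CHR would lead one to expect.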

For the following reasoning we require that the rules of a CHR program contain
pairwise disjoint sets of variables. This requirement entails no loss of generality, because every
CHR program can easily be transformed into one with disjoint sets of variables.

In  order  to  give  a  characterization  for  local  confluence  we  have  to  introduce  the 
notion  of  critical  pairs: 

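Definition 24. If one or more atoms $H_{i_1}, \ldots, H_{i_k}$ of the head of a CHR rule
$H_1, \ldots, H_n \Leftrightarrow G \mid B$ unify with one or more atoms $H'_{j_1}, \ldots, H'_{j_k}$ of the head
of another or the same CHR rule $H'_1, \ldots, H'_m \Leftrightarrow G' \mid B'$ then the triple

$(G \wedge G' \wedge H_{i_1} = H'_{j_1} \wedge \ldots \wedge H_{i_k} = H'_{j_k} \mid B \wedge H'_{j_{k+1}} \wedge \ldots \wedge H'_{j_m} \;=\!\downarrow\!=\; B' \wedge H_{i_{k+1}} \wedge \ldots \wedge H_{i_n} \mid V)$

is called a critical pair of the two CHR rules. $\{i_1, \ldots, i_n\}$ and $\{j_1, \ldots, j_m\}$ are
permutations of $\{1, \ldots, n\}$ and $\{1, \ldots, m\}$ respectively, and $V$ is the set of variables appearing
in $H_1, \ldots, H_n, H'_1, \ldots, H'_m$.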

Example 3. Consider example 1. There are two trivial* critical pairs and the following
nontrivial critical pair:

(X≤Y ∧ Y1≤X1 ∧ X=X1 ∧ Y=Y1 ∧ Z=Z1 |
Z=Y =↓= Z1=X1 | [X,Y,Z,X1,Y1,Z1])

The rules of example 2 have the following nontrivial critical pair (we omit the global
variable store for reasons of clarity):

(X=X1 ∧ Y=Y1 |
imp(X,Y) ∧ X1=0 ∧ Y1=1 ∧ Z1=1 =↓= or(X1,Z1,Y1) ∧ X=0 ∧ Y=1 | [..])

The trivial critical pairs in example 1 stem from unifying the head of either
the first or the second rule with itself. Note that not every critical pair stemming
from one rule only is trivial. If the head of a rule contains a constraint symbol more
than once, the resulting critical pair may be nontrivial.

* We call critical pairs of the form (G | B =↓= B | V) trivial.

Definition 25. A critical pair $(G \mid B_1 =\!\downarrow\!= B_2 \mid V)$ is called joinable if $\langle B_1, G, V\rangle$
and $\langle B_2, G, V\rangle$ are joinable.

Example 4. The first critical pair in example 3 is joinable if the built-in constraint
solver simplifies X≤Y ∧ Y≤X to the constraint X=Y.
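The overlap behind this critical pair can be checked on ground instances. The sketch below is a hypothetical Python rendering of the two maximum rules of example 1, assuming the guards are X≤Y and Y≤X and that X and Y are bound to integers; it is not a CHR implementation, only a way to see that both rules fire exactly when X=Y and then agree on Z.

```python
def applicable_results(x, y):
    """Values of Z produced by every rule whose guard holds for ground x, y."""
    results = []
    if x <= y:              # guard of the first rule, body Z=Y
        results.append(y)
    if y <= x:              # guard of the second rule, body Z=X
        results.append(x)
    return results

# Ground pairs on which both guards hold at the same time (overlapping guards).
overlap = [(x, y) for x in range(4) for y in range(4)
           if len(applicable_results(x, y)) == 2]

# On every ground instance the applicable rules agree on the result.
agree = all(len(set(applicable_results(x, y))) == 1
            for x in range(4) for y in range(4))
```

On non-ground goals the agreement depends on the built-in solver deriving X=Y, which is the condition stated in example 4.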

The  following  lemmas  are  necessary  to  prove  theorem  33.  The  proofs  for  these 
lemmas  can  be  found  in  [AFM96].  The  first  lemma  states  that  the  global  variables 
are  not  touched  when  testing  the  variance  of  two  states.  Crucial  for  this  lemma  is 
the  fact  that  V  is  an  ordered  set. 

Lemma 26. If

$\langle C_{U1}, C_{B1}, V\rangle \cong \langle C_{U2}, C_{B2}, V\rangle$
then the variables in $V$ are not modified by variable renaming.

The  following  lemma  shows  that  enclosement  guarantees  that  addition  of  built-in 
constraints  is  compatible  with  update: 

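Lemma 27. If $(C, C_U \wedge C_B)$ is enclosed by $V$ and

$\langle C_U, C_B, V\rangle \mapsto^* \langle C_U', C_B', V\rangle$ then
$\langle C_U, C_B \wedge C, V\rangle \mapsto^* \mathrm{update}(\langle C_U', C_B' \wedge C, V\rangle)$.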

We  apply  lemma  27  to  prove  lemma  28,  stating  the  enclosement  conditions  under 
which  joinability  of  states  is  compatible  with  addition  of  built-in  constraints. 

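Lemma 28. If

$\langle C_{U1}, C_{B1}, V\rangle \mapsto^* \langle C_{U1}', C_{B1}', V\rangle$,
$\langle C_{U2}, C_{B2}, V\rangle \mapsto^* \langle C_{U2}', C_{B2}', V\rangle$,
$\langle C_{U1}', C_{B1}', V\rangle \cong \langle C_{U2}', C_{B2}', V\rangle$

and $(C, C_{U1} \wedge C_{B1})$ and $(C, C_{U2} \wedge C_{B2})$ are enclosed by $V$, then

a)

$\langle C_{U1}, C_{B1} \wedge C, V\rangle \mapsto^* \mathrm{update}(\langle C_{U1}', C_{B1}' \wedge C, V\rangle)$,

$\langle C_{U2}, C_{B2} \wedge C, V\rangle \mapsto^* \mathrm{update}(\langle C_{U2}', C_{B2}' \wedge C, V\rangle)$,

$\mathrm{update}(\langle C_{U1}', C_{B1}' \wedge C, V\rangle) \cong \mathrm{update}(\langle C_{U2}', C_{B2}' \wedge C, V\rangle)$,

b)

$\langle C_{U1} \wedge C, C_{B1}, V\rangle \mapsto^* \mathrm{update}(\langle C_{U1}' \wedge C, C_{B1}', V\rangle)$,

$\langle C_{U2} \wedge C, C_{B2}, V\rangle \mapsto^* \mathrm{update}(\langle C_{U2}' \wedge C, C_{B2}', V\rangle)$,

$\mathrm{update}(\langle C_{U1}' \wedge C, C_{B1}', V\rangle) \cong \mathrm{update}(\langle C_{U2}' \wedge C, C_{B2}', V\rangle)$.

Definition 29. We call two states $\langle C_{U1}, C_{B1}, V\rangle$ and $\langle C_{U2}, C_{B2}, V\rangle$ update
equivalent iff

$\mathrm{update}(\langle C_{U1}, C_{B1}, V\rangle) = \mathrm{update}(\langle C_{U2}, C_{B2}, V\rangle)$

Lemma 30. If $\langle C_U, C_B, V\rangle$ and $\langle C_U', C_B', V\rangle$ are update equivalent and
$\langle C_U, C_B, V\rangle \mapsto^* S$ then $\langle C_U', C_B', V\rangle \mapsto^* S$.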

Proof.  The  lemma  follows  directly  from  lemma  10. 

The next lemma gives a condition under which joinability is compatible with changing
the global variable store:

Lemma 31. Let $\langle C_{U1}, C_{B1}, V\rangle$ and $\langle C_{U2}, C_{B2}, V\rangle$ be joinable. Then the
following holds:

a) $\langle C_{U1}, C_{B1}, V'\rangle$ and $\langle C_{U2}, C_{B2}, V'\rangle$ are joinable,
if $V'$ consists only of variables contained in $V$.

b) $\langle C_{U1}, C_{B1}, V \circ V'\rangle$ and $\langle C_{U2}, C_{B2}, V \circ V'\rangle$ are joinable,
if $V'$ contains only fresh variables ($\circ$ denotes concatenation).

The following theorem is the analogue of Newman's Lemma for term rewriting
systems [Pla93] and is proven analogously:

Theorem 32 (confluence of CHR programs). If a CHR program is locally confluent
and terminating, it is confluent.

Theorem  33  gives  a  characterization  for  locally  confluent  CHR  programs.  The 
proof  is  given  in  [AFM96]  and  relies  on  lemmas  26  to  31. 

Theorem 33 (local confluence of CHR programs). A terminating CHR program
is locally confluent if and only if all its critical pairs are joinable.

The theorem also means that, for a program whose termination is not known, we can
decide whether it will be confluent in case it is terminating.

Example 5. This example illustrates the case that an unjoinable critical pair is
detected. The following CHR program is an implementation of merge/3, i.e. merging
two lists into one list as the elements of the input lists arrive. Thus the order of
elements in the final list can differ from computation to computation.

merge([],L2,L3) ⇔ true | L2=L3.
merge(M1,[],M3) ⇔ true | M1=M3.

merge([X|N1],N2,N3) ⇔ true | N3=[X|N], merge(N1,N2,N).
merge(O1,[Y|O2],O3) ⇔ true | O3=[Y|O], merge(O1,O2,O).

There are 8 critical pairs, 4 of them stemming from different rules.

If merge/3 meets the specification, there is room for nondeterminism that causes
non-confluence. Indeed, a look at the critical pairs reveals one critical pair stemming
from the third and fourth rule that is not joinable:

([X|N1]=O1 ∧ N2=[Y|O2] ∧ N3=O3 |
N3=[X|N] ∧ merge(N1,N2,N) =↓= O3=[Y|O] ∧ merge(O1,O2,O) | [..])

It can be seen from the unjoinable critical pair above that a state like
⟨merge([a],[b],L), true, [L]⟩ can either result in putting a before b in the
output list L or vice versa, since a Simplify step can result in differing unjoinable
states, depending on which rule is applied. Hence, not surprisingly, merge/3 is not
confluent.
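The nondeterminism of merge/3 can be reproduced with a few lines of Python. This is a sketch for ground input lists only, with a hypothetical function name `merge_results`; it enumerates the output lists obtainable by choosing, at each step, between the third and the fourth rule, and it does not model the constraint store.

```python
def merge_results(xs, ys):
    """Set of all output lists reachable by applying the merge/3 rules to ground lists."""
    if not xs:
        # First rule (L2=L3); the fourth rule is also applicable here
        # but leads to the same output, so one case suffices.
        return {tuple(ys)}
    if not ys:
        # Second rule (M1=M3); symmetric to the case above.
        return {tuple(xs)}
    results = set()
    for rest in merge_results(xs[1:], ys):      # third rule: emit the head of xs
        results.add((xs[0],) + rest)
    for rest in merge_results(xs, ys[1:]):      # fourth rule: emit the head of ys
        results.add((ys[0],) + rest)
    return results

outcomes = merge_results(["a"], ["b"])
```

For the goal merge([a],[b],L) this yields two different final answers, one per rule choice, which matches the unjoinable critical pair above.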

4  Correctness  and  Confluence  of  CHR  Programs 

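Definition 34. Given a CHR program P, we define the computation equivalence
$\leftrightarrow_P$: $S_1 \leftrightarrow_P S_2$ iff $S_1 \mapsto S_2$ or $S_2 \mapsto S_1$. $S \leftrightarrow_P^* S'$ iff there is a sequence $S_1, \ldots, S_n$
such that $S_1$ is $S$, $S_n$ is $S'$ and $S_i \leftrightarrow_P S_{i+1}$ for all $i$. We will write $\leftrightarrow$ instead of $\leftrightarrow_P$
and $\leftrightarrow^*$ instead of $\leftrightarrow_P^*$ if the program P is fixed.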

For  the  sake  of  simplicity  and  clarity  we  prove  the  following  two  lemmas  only 
for  the  special  case  that  all  rules  are  ground-instantiated,  without  guards  and  that 
true  and  false  are  the  only  built-in  constraints  used.  One  can  extend  the  proof  to 
full  CHR  by  transforming  each  rule  of  a  CHR  program  into  (possibly  infinitely  many) 
ground-instantiated  rules.  This  includes  evaluating  the  built-in  constraints  in  the 
guards  and  bodies. 

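Lemma 35. If P is confluent, then $\langle true, true, V\rangle \leftrightarrow_P^* \langle true, false, V\rangle$ does not
hold.

Proof. We write $S \;{}^{*}\!\!\mapsfrom S'$ for $S' \mapsto^* S$. We show by induction on $n$ that there are no
states $S_1, T_1, S_2, \ldots, T_{n-1}, S_n$ such that

$\langle true, true, V\rangle \;{}^{*}\!\!\mapsfrom S_1 \mapsto^* T_1 \;{}^{*}\!\!\mapsfrom S_2 \mapsto^* \ldots \mapsto^* T_{n-1} \;{}^{*}\!\!\mapsfrom S_n \mapsto^* \langle true, false, V\rangle$

Base case: $\langle true, true, V\rangle \;{}^{*}\!\!\mapsfrom S_1 \mapsto^* \langle true, false, V\rangle$ cannot exist, because
$\langle true, true, V\rangle$ and $\langle true, false, V\rangle$ are different (no variants) final states and P
is confluent.

Induction step: We assume that the induction hypothesis holds for $n$, i.e.
$\langle true, true, V\rangle \;{}^{*}\!\!\mapsfrom S_1 \mapsto^* T_1 \;{}^{*}\!\!\mapsfrom S_2 \mapsto^* \ldots \mapsto^* T_{n-1} \;{}^{*}\!\!\mapsfrom S_n \mapsto^* \langle true, false, V\rangle$
does not exist. We prove the assertion for $n+1$ by contradiction:

We assume that a sequence of the form
$\langle true, true, V\rangle \;{}^{*}\!\!\mapsfrom S_1 \mapsto^* T_1 \;{}^{*}\!\!\mapsfrom S_2 \mapsto^* T_2 \;{}^{*}\!\!\mapsfrom \ldots \;{}^{*}\!\!\mapsfrom S_n \mapsto^* T_n \;{}^{*}\!\!\mapsfrom S_{n+1} \mapsto^* \langle true, false, V\rangle$
exists. We will lead this assumption to a contradiction.

Since P is confluent, $\langle true, false, V\rangle$ and $T_n$ are joinable (both are reachable from
$S_{n+1}$). Since $\langle true, false, V\rangle$ is a final state, there is a computation of $T_n$ that
results in $\langle true, false, V\rangle$ ($T_n \mapsto^* \langle true, false, V\rangle$), and hence
$S_n \mapsto^* \langle true, false, V\rangle$. Therefore there is a sequence of the form

$\langle true, true, V\rangle \;{}^{*}\!\!\mapsfrom S_1 \mapsto^* T_1 \;{}^{*}\!\!\mapsfrom S_2 \mapsto^* T_2 \ldots \;{}^{*}\!\!\mapsfrom S_{n-1} \mapsto^* T_{n-1} \;{}^{*}\!\!\mapsfrom S_n \mapsto^* \langle true, false, V\rangle$,

which is a contradiction to the induction hypothesis.

Lemma 36. If $\langle true, true, V\rangle \leftrightarrow^* \langle true, false, V\rangle$ does not hold, then $\mathcal{P} \cup CT$ is
consistent.

Proof. We show consistency by defining an interpretation which is a model of $\mathcal{P}$,
and therefore of $\mathcal{P} \cup CT$.

We define $I_0 := \{\{C_1, \ldots, C_n\} \mid \langle C_1 \wedge \ldots \wedge C_n, true, V\rangle \leftrightarrow^* \langle true, true, V\rangle\}$.

Let $I := (\bigcup I_0) \setminus \{true\}$ ($\bigcup M$ is the union of all members of $M$). $false \notin I$,
because $\langle false, true, V\rangle \leftrightarrow^* \langle true, true, V\rangle$ does not hold. Therefore $I$ is a Herbrand
interpretation.

We show that $I \models \mathcal{P}$:

For all formulas $H_1 \wedge \ldots \wedge H_n \leftrightarrow B_1 \wedge \ldots \wedge B_m$ in $\mathcal{P}$ the following equivalences hold:

$I \models H_1 \wedge \ldots \wedge H_n$
iff $\{H_1, \ldots, H_n\} \subseteq I$
iff $\langle H_1 \wedge \ldots \wedge H_n, true, V\rangle \leftrightarrow^* \langle true, true, V\rangle$
iff $\langle B_1 \wedge \ldots \wedge B_m, true, V\rangle \leftrightarrow^* \langle true, true, V\rangle$
iff $\{B_1, \ldots, B_m\} \subseteq I$
iff $I \models B_1 \wedge \ldots \wedge B_m$.

Therefore $I \models H_1 \wedge \ldots \wedge H_n \leftrightarrow B_1 \wedge \ldots \wedge B_m$ for all formulas
$H_1 \wedge \ldots \wedge H_n \leftrightarrow B_1 \wedge \ldots \wedge B_m$ in $\mathcal{P}$.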

Theorem 37. If P is confluent, then $\mathcal{P} \cup CT$ is consistent.

Proof. The theorem follows directly from lemmas 35 and 36.

Maher proves the following result for deterministic programs: if any computation
sequence terminates in failure, then every (fair) computation sequence terminates in
failure. We extend this result to confluent programs and give, compared to theorem
19, a closer relation between the operational and declarative semantics.

Definition  38.  A  computation  is  fair  iff  the  following  holds: 

If  a  rule  can  be  applied  infinitely  often  to  a  goal,  then  it  is  applied  at  least  once. 

Lemma 39. Let P be a confluent CHR program and G be a goal which has a finitely
failed derivation. Then every fair derivation of G is finitely failed.

The following theorem is a consequence of the above lemma and theorem 19.

Theorem 40. Let P be a confluent program and G be a goal.

The following are equivalent:

a) $\mathcal{P}, CT \models \neg\exists G$

b) G has a finitely failed computation.

c) Every fair computation of G is finitely failed.




5  Conclusion  and  Future  Work 

We introduced the notion of confluence for Constraint Handling Rules (CHR).
Confluence guarantees that a CHR program will always compute the same result for a
given set of user-defined constraints independent of which rules are applied.

We  have  given  a  characterization  of  confluent  CHR  programs  through  joinability 
of  critical  pairs,  yielding  a  decidable,  syntactically  based  test  for  confluence.  We 
have  shown  that  confluence  is  a  sufficient  condition  for  logical  correctness  of  CHR 
programs.  Correctness  is  an  essential  property  of  constraint  solvers. 

We  also  gave  various  soundness  and  completeness  results  for  CHR  programs. 
Some  of  these  theorems  are  stronger  than  what  holds  for  the  related  families  of 
(concurrent)  constraint  programming  languages  due  to  correctness. 

Our approach complements recent work [MO95] that gives confluent, non-standard
semantics for CC languages to make them amenable to abstract interpretation and
analysis in general, since our confluence test can identify parts of CC programs
which are already confluent under the standard semantics.

Current work integrates the two other kinds of CHR rules, the propagation and
the simpagation rules, into our condition for confluence. We are also developing
a tool in ECLiPSe (ECRC Constraint Logic Programming System [Ecl94]) which
tests confluence of CHR programs. Preliminary tests show that most existing
constraint solvers written in CHR are indeed confluent, but that there are also inherently
non-confluent solvers (e.g. performing Gaussian elimination). We also plan to
investigate completion methods to make a non-confluent CHR program confluent.

Abstract. Numerous arc consistency algorithms have been developed for filtering
constraint satisfaction problems (CSPs), but few of them take the semantics of the
constraints into account. Arc consistency algorithms work with a queue containing the
elements to reconsider, so some constraints may be checked many times. Recently, Liu
proposed AC5+, an improved, specialized version of the AC5 algorithm. AC5+ deals with a
subclass of functional constraints, called "Increasing Functional Constraints (IFC)". It
allows some IFC constraints of a CSP to be checked only once when achieving arc
consistency. In this paper, we propose a labelling arc consistency method (LAC) for
filtering CSPs containing functional constraints. LAC uses two concepts: arc consistency
and label-arc consistency. It allows all functional constraints to be checked only once,
and some general constraints to be checked at most twice. Although the complexity of LAC
is still in O(ed) for functional constraints, where e is the number of constraints and d
the size of the largest domain, the technique used in LAC improves the performance and
the effectiveness of classical arc consistency algorithms for CSPs containing functional
constraints. The empirical results presented show the substantial gain brought by the
LAC method.

1. Introduction

A  constraint  satisfaction  problem  consists  of  assigning  values  to  variables  which  are 
subject  to  a  set  of  constraints. 

Numerous arc consistency algorithms have been developed for filtering constraint
satisfaction problems (CSPs) [1, 2, 7] before or during the search for a solution, but
few of them take the semantics of the constraints into account. Important classes of
constraints (functional, anti-functional, monotonic, ...) have been studied in recent
years. These types of constraints arise in several concrete applications, such as job
scheduling [6] and constraint logic programming languages [4, 10]. The basic constraints
used in these languages are special cases of functional and monotonic constraints. David
[3] has proposed a filtering algorithm, "Pivot consistency", for functional constraints.
Van Hentenryck et al. [11] have developed a generic arc consistency algorithm, AC5.
This algorithm achieves arc consistency in O(ed) on functional, anti-functional and
monotonic constraints, where e is the number of constraints and d is the size of the
largest domain. Arc consistency algorithms work with a queue containing the elements
to reconsider, so some constraints may be rechecked many times during the
constraint propagation process. More recently, Liu [6] has proposed an improved
version AC5+ of the AC5 algorithm. AC5+ deals with a subclass of functional
constraints, called "Increasing Functional Constraints (IFC)". It allows some IFC
constraints of a CSP to be checked only once when achieving arc consistency.

In this paper, we propose a labelling arc consistency method (LAC) for filtering CSPs
containing functional constraints. This method is based on the following two
properties of functional constraints. The first is that if, for any two variables Xi and Xj
of a CSP, there exists a sequence of functional constraints between Xi and Xj, then
each value of a domain participates in at most one solution of this CSP. The second is
that the composition of a set of functional constraints is a functional constraint. These
properties are exploited to develop an algorithm that allows all functional constraints
to be checked only once, and some general constraints to be checked at most twice.
This is achieved by assigning a label to each value and each domain of a CSP, and
defining a local consistency concept, called label-arc consistency, by slightly
modifying the arc consistency concept. The labelling of domains and values keeps
track of the solutions in which some values participate. LAC applies the label-arc
consistency concept only to the constraints Cij such that the domains Di and Dj have the
same label. This improves the performance and the effectiveness of classical arc
consistency algorithms for filtering CSPs containing functional constraints. Naturally,
this technique can be embedded in any general arc consistency algorithm.

LAC deals not only with the class of functional constraints, which contains the class of
IFC constraints considered by AC5+, but also with some general constraints. AC5+
does not guarantee that any IFC constraint of a CSP will be checked only once, while
LAC checks all functional constraints only once, and in addition some general
constraints are checked at most twice. Hence the proportion of constraints checked
uselessly many times by AC5 and AC5+ is reduced by the LAC method. LAC performs
better and is more effective on CSPs containing functional constraints than the
classical arc consistency algorithms since, on the one hand, this method avoids
checking certain constraints many times and, on the other hand, LAC
allows in some cases the solving of the CSP or an improvement of the effectiveness of
the filtering.

Another  advantage  of  the  LAC  method  is  its  incremental  behaviour.  If  we  want  to  add 
a  constraint  to  a  label-arc  consistent  CSP  (P),  the  functional  constraints  and  some 
general  constraints  of  (P)  will  not  be  checked  for  making  the  new  CSP  label-arc 
consistent.  The  empirical  results  presented  show  the  substantial  gain  brought  by  the 
LAC  method. 

The rest of this paper is organized as follows. Section 2 reviews some needed
definitions on the constraint satisfaction framework. Section 3 describes related
work. Section 4 presents the LAC method. Section 5 presents empirical results and
section 6 concludes.

2.  Preliminaries 

In  this  section,  we  present  the  formalism  of  CSPs  introduced  by  Montanari  [9]. 

A  binary  CSP  P  is  defined  by  (X,  D,  C,  R),  where: 

- X is a set of n variables {X1, ..., Xn};

- D is a set of n domains {D1, ..., Dn} where Di is the set of all possible values for Xi;
- C is a set of m constraints where Cij (i<j) is the constraint between the variables Xi
and Xj, defined by its relation Rij;

- R is a set of m relations Rij, where Rij is a subset of the Cartesian product Di x Dj
specifying the compatible values between Xi and Xj.

The constraint graph represents variables and constraints of the CSP in the form of a
network, where each variable is represented by a vertex and each constraint by an edge.

For every constraint Cij, the predicate Rij(vr, vs) holds if and only if (vr, vs) belongs to
the relation Rij.

We  recall  now  the  definitions  of  arc  consistency,  functional  constraint  and  increasing 
functional  constraint. 

Definition 1 A domain Di of D is arc consistent iff, for each vr ∈ Di and for all Xj ∈ X such
that Cij ∈ C, there exists vs ∈ Dj such that Rij(vr, vs). A CSP is arc consistent iff, for all
Di ∈ D, Di is arc consistent and Di is not empty.

Definition 2 A constraint Cij is functional iff for all vr ∈ Di (resp. vs ∈ Dj) there
exists at most one vs ∈ Dj (resp. vr ∈ Di) such that Rij(vr, vs). We note vs = fij(vr)
and vr = fji(vs).

Definition 3 A constraint Cij is an Increasing Functional Constraint (IFC) iff Cij
is functional and for all vr, vs ∈ Di such that fij(vr) and fij(vs) exist in Dj, vr <
vs implies fij(vr) < fij(vs).

Observe that if Cij is an IFC then the constraint is functional; the converse is false.
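Definitions 1 to 3 translate directly into small checks on explicit relations. The following Python sketch is illustrative only: it assumes a relation Rij is given as a set of pairs (vr, vs) over orderable values, and the helper names `is_functional`, `is_increasing_functional` and `revise` are hypothetical.

```python
def is_functional(relation):
    """Each vr has at most one vs and each vs has at most one vr (Definition 2)."""
    left, right = {}, {}
    for vr, vs in relation:
        if left.setdefault(vr, vs) != vs or right.setdefault(vs, vr) != vr:
            return False
    return True

def is_increasing_functional(relation):
    """Functional, and vr < vs implies fij(vr) < fij(vs) (Definition 3)."""
    if not is_functional(relation):
        return False
    pairs = sorted(set(relation))           # ordered by vr, since vr values are distinct
    return all(a[1] < b[1] for a, b in zip(pairs, pairs[1:]))

def revise(di, dj, relation):
    """Values of Di that have at least one support in Dj (one direction of Definition 1)."""
    return {vr for vr in di if any((vr, vs) in relation for vs in dj)}

succ = {(1, 2), (2, 3), (3, 4)}             # Xj = Xi + 1: an IFC
mirror = {(1, 3), (2, 2), (3, 1)}           # functional but decreasing
leq = {(1, 1), (1, 2), (2, 2)}              # Xi <= Xj: not functional
```

The `mirror` relation is an instance of the remark above: it is functional without being an IFC.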

3.  Related  Work 

Our work is motivated by the AC5 and AC5+ algorithms developed by Van
Hentenryck et al. [11] and Liu [6], respectively. AC5 is a generic arc consistency
algorithm which can be specialized for functional constraints, anti-functional
constraints, monotonic constraints, and their piecewise constraints. Like classical arc
consistency algorithms, AC5 manages a queue containing elements to be reconsidered,
so some constraints can be checked uselessly many times.

AC5+ is an improved version of AC5. It allows some IFC constraints of a
CSP to be checked only once when achieving arc consistency. A drawback of
AC5+ is that its use is limited to a restrictive class of functional constraints. It does
not even guarantee that any IFC constraint of a CSP will be checked only once, since
this depends on the order in which the constraints are checked. Many general
constraints are also checked uselessly many times.

The proposed method LAC checks all functional constraints only once, and some
general constraints at most twice. Hence the proportion of constraints checked
uselessly many times is reduced. LAC is more effective on CSPs containing
functional constraints than the classical arc consistency algorithms, since this
method uses a filtering concept more general than the arc consistency concept.
Although our method is still in O(ed) for functional constraints, the computational
experiments show that it is more efficient than AC5 and AC5+.

4.  The  LAC  algorithm 


We  begin  this  section  with  some  needed  definitions.  Then  we  present  the  core  of  the 
technique  used  in  LAC,  and  the  corresponding  algorithm. 

Definition 4 Assume that the domains of a CSP and the values of each domain are
labelled. A value vr of Di is label-arc consistent iff for each domain Dj having the
same label as Di, there exists a value vs such that we have Rij(vr, vs), and the values
vr and vs have the same label.

Definition 5 A domain Di is label-arc consistent iff each value of Di is label-arc
consistent. A CSP (P) is label-arc consistent iff each domain of (P) is label-arc
consistent and not empty.

Example Assume that D1 and D2 have the same label, the values a and b of D1 are
labelled by 1 and 2 respectively, and the values a and b of D2 are labelled by 2 and 3
respectively. Although the domains D1 and D2 are arc consistent, applying label-arc
consistency removes the value a from D1 and the value b from D2, since neither value
has a support with the same label in the other domain.
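The example can be replayed with a short Python sketch. It is an illustration under stated assumptions: the relation between the two domains is not given in the text, so the sketch assumes that every pair of values is compatible (which makes D1 and D2 arc consistent), and the function name `label_arc_revise` is hypothetical.

```python
def label_arc_revise(di, dj, relation, label_i, label_j):
    """Keep the values of Di that have a support in Dj carrying the same label."""
    return {vr for vr in di
            if any((vr, vs) in relation and label_i[vr] == label_j[vs] for vs in dj)}

d1, d2 = {"a", "b"}, {"a", "b"}
labels1 = {"a": 1, "b": 2}                      # labels of the values of D1
labels2 = {"a": 2, "b": 3}                      # labels of the values of D2
r12 = {(x, y) for x in d1 for y in d2}          # assumed: all pairs compatible
r21 = {(y, x) for x, y in r12}

new_d1 = label_arc_revise(d1, d2, r12, labels1, labels2)
new_d2 = label_arc_revise(d2, d1, r21, labels2, labels1)
```

Only b in D1 and a in D2 survive, because they are the only values sharing the label 2.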


4.1  Functional  constraints 

The LAC method is based on two important properties of functional constraints. The
first property exploited by the LAC method makes it possible to check all functional
constraints only once, while the second property allows some general constraints to be
checked at most twice.

Property 1 Let (P) be a CSP, and assume that for each pair of variables Xi and Xj of
(P), there exists a sequence of functional constraints between Xi and Xj. Then for each
value vr of a domain Di, there exists at most one solution such that Xi takes the value
vr.

Proof Assume that, for a value vr of Di, there exist two different solutions S1 and S2
of (P) such that Xi takes the value vr. Then there exists a variable Xj of (P) such that
Xj takes two different values in S1 and S2. Hence either there is a constraint between Xi
and Xj which is not functional, or in each sequence of constraints between Xi and Xj
at least one constraint is not functional. Otherwise, Xi and Xj would implicitly be
linked by a functional constraint, since the composition of a sequence of functional
constraints is a functional constraint. This contradicts the hypothesis. ♦

Before  presenting  the  second  property,  we  first  describe  the  labelling  process  which 
will  be  used  for  proving  this  property. 

The  principle  of  LAC 

Like classical arc consistency algorithms, the LAC method performs the filtering in two
phases. The first phase deals with the checking of constraints and the labelling process,
while the second phase deals with the constraint propagation process.

Assume that initially each domain Di is labelled by i and each value vr of Di is
labelled by r. The labelling process is performed during the first phase as follows.
The constraints of a CSP are checked progressively. When a functional constraint Cij
is encountered, we perform one of the following treatments.

(i) - Either the domains Di and Dj are labelled by i and j respectively, in which case arc
consistency between these domains is applied. For each tuple (vr, vs) of Rij, the value
vs will be labelled by r. The domain Dj will be labelled by i.

(ii) - Or the domains Di and Dj are labelled by k (k≠i and k≠j) and j respectively, in which
case arc consistency between these domains is applied. For each tuple (vr, vs) of Rij, the
value vs will be labelled by the same label as vr. The domain Dj will be labelled
by k. The case when Di and Dj are labelled by i and k (k≠j and k≠i) respectively is
similar to (ii).

(iii) - Or the labels of the domains Di and Dj are different; let k1 and k2 be the labels
of Di and Dj respectively. Arc consistency between the domains Di and Dj is applied.
For each tuple (vr, vs) of Rij, the value vs will be labelled by the same label as
vr. The domain Dk2 will be labelled by k1. Notice that all the domains which
were labelled with k2 will be implicitly¹ labelled with k1, and the values of domains
having the same label as vs will be implicitly² labelled with the label of vr.

(iv) - Or the domains Di and Dj have the same label, in which case label-arc
consistency between the domains Di and Dj is applied.

In all cases the constraint Cij will not be reconsidered.

When a general constraint Cij is encountered: either the domains Di and Dj have the
same label, in which case label-arc consistency between these domains is applied and this
constraint will not be reconsidered, or arc consistency between these domains is applied.

Note  that  in  (i),  (ii)  and  (iii),  the  labelling  process  is  propagated  over  the  domains  and 
the  values. 

During the second phase, the constraint propagation process rechecks only
some general constraints. Once label-arc consistency has been performed on a general
constraint, this constraint will not be reconsidered.
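The way domain labels are merged when functional constraints are processed resembles a union-find structure. The Python sketch below is a loose illustration of that idea and not the authors' algorithm: it tracks domain labels only (value labels and the actual filtering are omitted), the class `DomainLabels` and its methods are hypothetical names, and relabelling only the representative domain is an assumption drawn from case (iii).

```python
class DomainLabels:
    def __init__(self, n):
        # Initially each domain Di is labelled by i.
        self.label = {i: i for i in range(1, n + 1)}

    def find(self, i):
        """Deduce the current label of Di by following representative domains."""
        while self.label[i] != i:
            i = self.label[i]
        return i

    def functional(self, i, j):
        """Process a functional constraint Cij and report which filtering applies."""
        ki, kj = self.find(i), self.find(j)
        if ki == kj:
            return "label-arc consistency"      # case (iv): same label
        self.label[kj] = ki                     # relabel only the representative domain
        return "arc consistency"                # cases (i) to (iii)

labels = DomainLabels(4)
trace = [labels.functional(1, 2), labels.functional(3, 4),
         labels.functional(2, 3), labels.functional(1, 4)]
final = [labels.find(i) for i in range(1, 5)]
```

After the three functional constraints C12, C34 and C23, all four domains carry the label 1, so the last constraint between D1 and D4 is handled by label-arc consistency.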

Intuitively, the LAC algorithm partitions the set of domains of a CSP
into subsets of domains. Each subset groups the domains having the same label k,
and is represented by the domain Dk.

Definition 7 Dk is a representative domain if it is labelled by k. A representative
value vr of a representative domain is a value labelled by r.

We now present the second property, showing that some general constraints will be
considered as functional constraints after applying label-arc consistency. These
constraints will then not be reconsidered during the constraint propagation process.

Property 2 Let Ci1i2, Ci2i3, ..., Ci(k-1)ik be a sequence of functional constraints, and
Cipiq (1 ≤ p < q ≤ k) a general constraint. After applying label-arc consistency between the
domains Dip and Diq, for each value vr ∈ Dip (resp. vs ∈ Diq) labelled by lr (resp. ls)
there exists exactly one value vs in Diq (resp. vr in Dip) labelled by lr (resp. ls) such
that Ripiq(vr, vs) holds.

Proof Since the constraints Ci1i2, Ci2i3, ..., Ci(k-1)ik are functional, the labelling
process labels the domains Dip and Diq with the same label (the index of their
representative domain). Moreover, in Dip (resp. Diq) no two values have the same
label. Since the constraint Cipiq is implicitly functional (this constraint is the
composition of functional constraints), after applying label-arc consistency
between Dip and Diq, for each value vr ∈ Dip (resp. vs ∈ Diq) labelled by lr (resp. ls)
there exists exactly one value vs in Diq (resp. vr in Dip) labelled by lr (resp. ls) such
that Ripiq(vr, vs) holds. ♦

¹ The domains will not actually be labelled again; the new labelling of a domain can easily be deduced
when this domain is manipulated (see the ComputeLabelDomain procedure).

² This means that the labels of these values can be deduced (see the ComputeLabelValue procedure).

We can associate with each subset of domains 𝒟_k a subproblem P_k satisfying the
hypothesis of Property 1, such that for each general constraint C_ij of P_k there is a
sequence of functional constraints between X_i and X_j. These general constraints are in
reality functional constraints, since the composition of a set of functional constraints
is a functional constraint. Then, by Property 2, after applying label-arc consistency,
the labelling allows the identification of these general constraints^ which are in
reality functional. For instance, thanks to the labelling, a general constraint C_ij,
where D_i and D_j have the same label, is treated as a functional constraint by our
method. In the rest of this paper, we call these constraints implicit functional
constraints.

If a modification must be made to any domain of 𝒟_k, this modification is actually made
on the representative domain D_k. Namely, if a value v_s must be removed from a domain
D_j labelled by k, we actually remove its representative value from D_k since, by
Property 1, the value v_s participates in a solution of the CSP if and only if its
representative value in D_k also participates in that solution. As a consequence, the
functional constraints need not be reconsidered.

The algorithm  The first phase of LAC checks the constraints and labels the domains.
The second phase deals with the constraint propagation process. Two arrays are used for
storing the labels. An array label_domain, initialized as label_domain[i] = i for
i = 1, ..., n, indicates that each domain is initially represented by itself. An array
label_value, initialized as label_value[i][j] = j for i = 1, ..., n and j = 1, ..., d_i,
where d_i is the size of the domain D_i, plays the same role for the values.


LAC(CSP)
begin
    (phase 1)
    for each constraint C_ij of C
        if C_ij is functional
            then FunctionalTreatment(i, j);
            else NonFunctionalTreatment(i, j);
    (phase 2)
    while L is not empty
        consider an element (k1, label1) of L;
        while S_{k1,label1} is not empty
            consider an element (i, r, j, s) of S_{k1,label1};
            k2 = ComputeLabelDomain(i);
            if (k1 = k2)
                then CheckLabelNonFunctional(i, j, k1);
                else perform arc consistency between D_i and D_j^;
                     L <- L ∪ {the set of representative inconsistent values of D_i and D_j};
end.

Fig. 1.

^Note that our method does not modify these constraints; the fact that they are
functional is indicated by the labels.


Figure 1 shows the LAC algorithm. During the first phase, LAC checks each constraint
exactly once, using two procedures that provide a specific treatment for functional
constraints and for general constraints. During the second phase, LAC uses the procedure
CheckLabelNonFunctional(i, j, k1), which is called when the two domains have the same
label. Note that L contains the current representative values to remove.

The procedures in Figures 2 and 3 present the specific treatment devoted to functional
and non-functional constraints respectively. Two cases are examined by these procedures.
Either the labels of the domains D_i and D_j are equal (which means that there exists a
sequence of functional constraints between X_i and X_j), in which case the
CheckLabelFunctional (resp. CheckLabelNonFunctional) procedure is called to perform
label-arc consistency between the domains D_i and D_j. Or D_i and D_j have two different
representative domains D_k1 and D_k2 respectively. In this case, the FunctionalTreatment
procedure propagates the labelling so that all domains labelled with k1 or k2 have the
same representative domain D_k1 or D_k2, while the NonFunctionalTreatment procedure
performs arc consistency between the domains D_i and D_j.


FunctionalTreatment(i, j)
begin
    k1 = ComputeLabelDomain(i)
    k2 = ComputeLabelDomain(j)
    if (k1 = k2) then
        CheckLabelFunctional(i, j, k1);
    else
        PropagateLabel(i, j, k1, k2);
end.

Fig. 2.

NonFunctionalTreatment(i, j)
begin
    k1 = ComputeLabelDomain(i)
    k2 = ComputeLabelDomain(j)
    if (k1 = k2) then
        CheckLabelNonFunctional(i, j, k1);
    else
        perform arc consistency between D_i and D_j;
        L <- L ∪ {the set of representative inconsistent values of D_i and D_j};
end.

Fig. 3.


The PropagateLabel procedure propagates the labels from D_k1 to D_k2. Let D_i and D_j be
two domains labelled by k1 and k2 respectively, and C_ij a functional constraint. This
procedure labels the representative domain D_k2 by k1, so that the domains of the two
sets have the same label k1. When a domain D_k2 is labelled by k1, we transfer all
necessary information^ concerning D_k2 to the representative domain D_k1. In order to do
this, we use the notion of first support introduced in AC6 [1], slightly modifying the
data structure of this algorithm. We recall that this algorithm manages, for each value
(X_k, v_l), a list representing the set of values for which (X_k, v_l) is the first
support. In our case, we manage a list S_{k,l} which contains elements of the form
(i, r, j, s) for which the value (X_j, v_s) is the first support of the value (X_i, v_r)
relative to the constraint C_ij, and v_l ∈ D_k is the representative value of v_s ∈ D_j.
This allows us to keep a trace for searching a new first support for the value
(X_i, v_r) when the current representative value (X_k, v_l) is removed. It indicates
that the new first support will be searched in D_j.

Thus, the transfer of the information concerning D_k2 to the representative domain D_k1
is made by updating the lists S_{k1,l1} for l1 = 1, ..., d_k1, where d_k1 is the size of
D_k1. This is done with at most d operations.

It suffices to update some pointers. Namely, we concatenate the lists S_{k2,l2} to
S_{k1,l1} for each pair of values v_l1 and v_l2 of D_k1 and D_k2, where v_l1 and v_l2
are the representative values of v_r ∈ D_i and f_ij(v_r) ∈ D_j.

^We use the specific treatment of the arc consistency algorithm AC5 devoted to the
different types of constraints (anti-functional, monotonic, ...).

^The information necessary for making the constraint propagation process.


PropagateLabel(i, j, k1, k2)
begin
    domain <- empty;
    for each value v_r ∈ D_i
        label1 = ComputeLabelValue(i, r);
        if (label1)
            then if (v_s = f_ij(v_r)) ∈ D_j
                then
                    label2 = ComputeLabelValue(j, s);
                    if (not(label2))
                        then L <- L ∪ {(k1, label1)};
                             remove v_label1 from D_k1;
                        else
                             S_{k1,label1} <- S_{k1,label1} ∪ S_{k2,label2};
                             label_value[k2][label2] = label1;
                             domain <- domain ∪ {v_s};
                else
                    L <- L ∪ {(k1, label1)};
                    remove v_label1 from D_k1;
    for each value v_s ∈ D_j and v_s ∉ domain
        do label = ComputeLabelValue(j, s);
           if (label) then L <- L ∪ {(k2, label)};
                           remove v_label from D_k2;
    label_domain[k2] = k1;
end.

Fig. 4.


The procedures in Figures 5 and 6 perform label-arc consistency between the domains
D_i and D_j. The CheckLabelFunctional procedure deals with functional constraints,
while the CheckLabelNonFunctional procedure deals with general constraints.


CheckLabelFunctional(i, j, k)
begin
    domain <- empty;
    for each value v_r ∈ D_i
        label1 = ComputeLabelValue(i, r);
        if (label1)
            then if (v_s = f_ij(v_r)) ∈ D_j
                then
                    label2 = ComputeLabelValue(j, s);
                    if (not(label2)) then
                        L <- L ∪ {(k, label1)};
                        remove v_label1 from D_k;
                    else if (label1 ≠ label2) then
                        L <- L ∪ {(k, label1)};
                        remove v_label1 from D_k;
                        L <- L ∪ {(k, label2)};
                        remove v_label2 from D_k;
                    else domain <- domain ∪ {v_s};
                else
                    L <- L ∪ {(k, label1)};
    for each value v_s ∈ D_j and v_s ∉ domain
        do label = ComputeLabelValue(j, s);
           if (label) then L <- L ∪ {(k, label)};
                           remove v_label from D_k;
end.

Fig. 5.

CheckLabelNonFunctional(i, j, k)
begin
    domain <- empty;
    for each value v_r ∈ D_i
        do label1 = ComputeLabelValue(i, r);
           if (label1)
               then support = false;
                    for each value v_s ∈ D_j
                        do if R_ij(v_r, v_s)
                            then
                                label2 = ComputeLabelValue(j, s);
                                if (label1 = label2) then
                                    support = true;
                                    domain <- domain ∪ {v_s};
                                    break;
                    if (not(support)) then L <- L ∪ {(k, label1)};
                                           remove v_label1 from D_k;
    for each value v_s ∈ D_j and v_s ∉ domain
        do label = ComputeLabelValue(j, s);
           if (label) then L <- L ∪ {(k, label)};
                           remove v_label from D_k;
end.

Fig. 6.

The procedures in Figures 7 and 8 supply the label of a domain and the label of a value
respectively. The function ComputeLabelDomain returns the label k of a domain D_i, where
D_k is the representative domain of D_i. The function ComputeLabelValue returns the
label of a value v_r, or false when the representative value of v_r has already been
removed (which means that v_r is inconsistent).


ComputeLabelDomain(i)
begin
    rep = i;
    while (label_domain[rep] ≠ rep)
        do rep = label_domain[rep];
    return (rep);
end.

Fig. 7.

ComputeLabelValue(i, r)
begin
    label = r;
    rep = i;
    while (label_domain[rep] ≠ rep)
        do label = label_value[rep][label];
           rep = label_domain[rep];
    if (v_label ∉ D_rep) then return (false);
    return (label);
end.

Fig. 8.
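The two lookup procedures of Figures 7 and 8 can be sketched in Python as follows. This is only an illustration under stated assumptions: domains and values are indexed from 0, `label_domain` is a list, `label_value` is a list of dictionaries, `domains[k]` is the set of surviving value indices of domain k, and `None` replaces the paper's `false` so that label 0 is not mistaken for a failure. The function and parameter names are ours.

```python
def compute_label_domain(label_domain, i):
    """Follow label_domain links from domain i up to its representative domain."""
    rep = i
    while label_domain[rep] != rep:
        rep = label_domain[rep]
    return rep


def compute_label_value(label_domain, label_value, domains, i, r):
    """Return the label of value r of domain i.

    Returns None when the representative value has already been removed
    from the representative domain (the value is then inconsistent).
    """
    label, rep = r, i
    while label_domain[rep] != rep:
        label = label_value[rep][label]   # translate the label one level up
        rep = label_domain[rep]
    return label if label in domains[rep] else None


# Hypothetical data: domain 2 is represented by 1, which is represented by 0.
label_domain = [0, 0, 1]
label_value = [{0: 0, 1: 1}, {0: 1, 1: 0}, {0: 0, 1: 1}]
domains = [{0}, {0, 1}, {0, 1}]           # value 1 of domain 0 was removed

print(compute_label_domain(label_domain, 2))
print(compute_label_value(label_domain, label_value, domains, 2, 0))
print(compute_label_value(label_domain, label_value, domains, 2, 1))
```

Because labels are resolved lazily by walking the chain, relabelling a whole subset of domains only requires one assignment to `label_domain`, as in the last line of PropagateLabel.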


An  illustrating  example 

Let  us  consider  the  following  CSP  whose  constraint  graph  is  : 


The  functional  constraints  are  represented  by  bold  edges. 

The following 8 points show the treatment performed by LAC during the first phase
on the CSP. The order in which the constraints are checked is chosen to give a good
illustration of the LAC method.

1) The constraint (1) is general, so arc consistency is performed between D1 and D2.

2) The constraint (2) is functional, so arc consistency is performed between D1 and
D4, and the domain D4 is labelled by 1.

3) The constraint (3) is functional, so arc consistency is performed between D2 and
D3, and the domain D3 is labelled by 2.

4) The constraint (4) is functional, so arc consistency is performed between D4 and
D2, and the domain D2 is labelled by 1, since D4 is already labelled by 1.

5) The constraint (5) is general; D1 and D3 have the same label 1 since D2 is labelled
by 1, so label-arc consistency is performed between D1 and D3.

6) The constraint (6) is general; D3 and D4 have the same label 1, so label-arc
consistency is performed between D3 and D4.

7) Once the constraints (7), (8), (9) and (10) are checked, we perform arc consistency
between the domains D4 and D5, D3 and D5, D1 and D6, D1 and D7.

8) The constraint (11) is functional, so arc consistency is performed between D6 and
D7, and the domain D7 is labelled by 6.

Note that the set of domains of this problem is partitioned into 3 subsets of domains:
𝒟_1 = {D1, D2, D3, D4}, the subset of domains labelled by 1 (their representative
domain is D1); 𝒟_6 = {D6, D7}, the subset of domains labelled by 6; and 𝒟_5 = {D5},
where the domain D5 is represented by itself.

During phase 2, all functional constraints and the general constraints 1 and 5 will
not be reconsidered. The constraint 6 is reconsidered at most once.

The  following  theorem  shows  the  completeness  of  the  LAC  algorithm. 

Theorem 1  Let (P) be a CSP and (P') the CSP obtained after applying the LAC
algorithm. The CSP (P') is label-arc consistent.

Proof  Like classical arc consistency algorithms, LAC proceeds in two phases. In the
first, each constraint is checked once, and a queue L of elements to reconsider in the
second phase is built. At the end of the first phase, the set of domains of the CSP is
partitioned into subsets 𝒟_1, ..., 𝒟_p : p < n of domains. Each of these subsets groups
together the domains having the same label k, and is represented by the domain D_k. Note
that we can associate with each subset of domains 𝒟_k a subproblem P_k.

The second phase performs the constraint propagation process. We show that, in this
phase, it is not necessary to reconsider the functional constraints, nor to reconsider
any general constraint of a subproblem P_k more than once. Let D_i ∈ 𝒟_k, let v_s be a
value of D_i, and let k be the label of D_i. Assume that v_s is detected to be
inconsistent; then the representative value v_r (v_r ∈ D_k) of v_s is enqueued in L. The
propagation process then checks the constraints of P_k and the general constraints
linking the variables of P_k to other subproblems. The functional constraints of each
P_k are not checked since, by Property 1, there is at most one solution containing the
value v_s and satisfying the constraints of P_k, and this solution is represented by the
value v_r of D_k. The general constraints of P_k are checked at most once: when a general
constraint of P_k is considered in this phase, label-arc consistency is applied to it,
and by Property 2 it becomes an implicit functional constraint and is not reconsidered.
The other non-functional constraints are treated as in the AC5 algorithm, using the
labelling and the notions of representative domain and representative value. ♦

4.2  Analysis 

We present here some results showing that the technique used in LAC makes this
algorithm more powerful than classical arc consistency algorithms. Namely, we show on
the one hand that LAC checks all functional constraints only once and some general
constraints at most twice, and on the other hand that some classes of CSPs can be solved
by achieving LAC alone, without backtracking.

Note  As shown in 4.1, LAC sometimes uses the label-arc consistency concept for
filtering the domains of a CSP; it is therefore clear that LAC filters more effectively
than classical arc consistency algorithms.

Theorem  2  Let  (P)  be  a  CSP.  If  all  the  constraints  of  (P)  are  functional,  then  (P)  can 
be  solved  by  applying  LAC.  Furthermore,  each  constraint  of  (P)  is  checked  only  once. 

Proof  The LAC algorithm checks the constraints of (P) progressively. When a constraint
C_ij is checked, there are two cases. Either the domains D_i and D_j have the same label,
in which case the label-arc consistency concept is applied to this constraint; if for
every tuple (v_r, v_s) of R_ij the labels of v_r and v_s are different, we conclude that
(P) has no solution (since their representative domain becomes empty). Or the domain D_i
is labelled by k1 and D_j by k2 (k1 ≠ k2), in which case the classical arc consistency
concept is applied, and for each tuple (v_r, v_s) of R_ij the values v_r and v_s will
have the same label. Assume that after checking each constraint of (P) once, no domain is
empty; we conclude that (P) has a solution. Let X_i be a variable of (P) and v_r a value
of D_i. Since the constraints of (P) are all functional, by Property 1 there exists at
most one solution of (P) such that X_i = v_r. The values of the other variables forming
this solution have the same label as v_r. All solutions of (P) can be found by taking,
for each of them, the values having the same label. Hence (P) has a solution if and only
if no domain of (P) is empty after applying the LAC algorithm. ♦


Example  Let  us  consider  the  following  sample  CSP  with  3  variables  and  3 
functional  constraints  whose  constraint  graph  is 


We  can  verify  easily  that  this  CSP  is  arc 
consistent,  but  we  cannot  say  whether  the  CSP  has 
no  solution  by  using  classical  arc  consistency 
algorithms. 


Applying LAC to this problem, the checking of C12 labels the domains D1 and D2 by 1,
the values a and b of D1 and D2 respectively by 1, and the values b and a of D1 and D2
respectively by 2. In the same way, the checking of the constraint C13 labels D3 by 1,
and the values a and b of D3 by 2 and 1 respectively. When the constraint C23 is
checked, since D2 and D3 have the same label, we apply the label-arc consistency concept
between these domains; their representative domain D1 then becomes empty, since the
values a and b of D2 and D3 are not label-arc consistent. Thus, we deduce that this
problem has no solution.

In the same way, for the following CSP with 3 variables and 3 functional constraints,
we cannot verify with classical arc consistency algorithms whether the problem has a
solution. Since no domain is empty after applying LAC, all solutions of this problem can
be computed by taking, for each of them, the values having the same label.
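The behaviour described in Theorem 2 and in these two examples can be reproduced with a small Python sketch. It is a simplified illustration, not the LAC procedure of Figures 1 to 8: labels are rewritten eagerly instead of being resolved through label chains, the support lists S are omitted, and every functional constraint is assumed to be one-to-one and given as a dictionary. The constraint values below ("different" for the first example, one "equal" constraint for the second) are assumptions chosen to be consistent with the labels quoted in the text, since the original figures are not available.

```python
def lac_functional(domains, constraints):
    """Label a CSP whose constraints are all one-to-one functional constraints.

    domains:     {variable: set of values}
    constraints: list of (i, j, f), f a dict sending each value of X_i to its
                 unique partner in X_j (assumed injective).
    Returns (rep, label, alive): representative of each variable, label of each
    value, and surviving values of each representative domain.
    """
    rep = {x: x for x in domains}
    label = {x: {v: v for v in d} for x, d in domains.items()}
    alive = {x: set(d) for x, d in domains.items()}

    def lab(x, v):
        """Label of value v of X_x, or None if its representative was removed."""
        l = label[x].get(v)
        return l if l in alive[rep[x]] else None

    for i, j, f in constraints:              # each constraint is checked once
        k1, k2 = rep[i], rep[j]
        same = k1 == k2
        supported, merge = set(), {}
        for vr in domains[i]:
            l1 = lab(i, vr)
            if l1 is None:
                continue
            vs = f.get(vr)
            l2 = lab(j, vs) if vs in domains[j] else None
            if l2 is None:                   # no partner left in D_j
                alive[k1].discard(l1)
            elif not same:                   # different labels: plan the relabelling
                merge[l2] = l1
            elif l1 == l2:                   # same label on both sides
                supported.add(l2)
            else:                            # label-arc inconsistent pair
                alive[k1] -= {l1, l2}
        if same:
            for vs in domains[j]:            # values of D_j left without support
                l = lab(j, vs)
                if l is not None and l not in supported:
                    alive[k1].discard(l)
        else:                                # give label k1 to the whole class of k2
            for x in domains:
                if rep[x] == k2:
                    label[x] = {v: merge[l] for v, l in label[x].items() if l in merge}
                    rep[x] = k1
            del alive[k2]
    return rep, label, alive


def solutions_for(rep, label, alive, k):
    """One assignment per surviving label of representative domain k."""
    return [{x: v for x in rep if rep[x] == k
             for v, lv in label[x].items() if lv == l}
            for l in sorted(alive[k])]


neq = {"a": "b", "b": "a"}                   # assumed "different" constraint
eq = {"a": "a", "b": "b"}                    # assumed "equal" constraint
doms = {1: {"a", "b"}, 2: {"a", "b"}, 3: {"a", "b"}}

_, _, alive = lac_functional(doms, [(1, 2, neq), (1, 3, neq), (2, 3, neq)])
print(alive[1])                              # empty: no solution

rep, label, alive = lac_functional(doms, [(1, 2, neq), (1, 3, neq), (2, 3, eq)])
print(solutions_for(rep, label, alive, 1))
```

In the first call every arc is consistent in the classical sense, yet the representative domain becomes empty when the third constraint is checked, which is the situation described above.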


The class of CSPs that can be solved using LAC can be extended to problems containing
general constraints.


Definition  8  Let  (P)  be  a  CSP  and  G=(V,  U)  its  constraint  graph,  where  V  and  U 
are  respectively  the  set  of  vertices  and  edges  representing  the  variables  and  the 
constraints  of  (P).  And,  let  E  be  the  set  of  edges  representing  the  general  constraints  of 
(P).  The  partial  graph  Gb  of  G  is  defined  by  (V,  U-E),  (Gb  is  obtained  by  removing 
from  G  the  set  of  edges  E). 


Theorem 3  If Gb is connected, then LAC achieves label-arc consistency on (P) by
checking each functional constraint once and each general constraint at most twice.
Moreover, if each general constraint C_ij of (P) is checked when D_i and D_j have the
same label, then LAC guarantees the solving of (P).

Proof  LAC checks the constraints of (P) progressively. When a functional constraint is
encountered, we use the same reasoning as in the proof of Theorem 2. When a general
constraint C_ij is checked, since Gb is connected, there are two cases. Either D_i and
D_j have the same label, in which case the label-arc consistency concept is applied to
this constraint, and for each tuple (v_r, v_s) of R_ij the values v_r and v_s will have
the same label. By Property 2, C_ij becomes an implicit functional constraint and is not
reconsidered. Or the domain D_i is labelled by k1 and D_j by k2 (k1 ≠ k2), and the arc
consistency concept is applied to this constraint. In this case, the constraint C_ij may
be reconsidered at most once, since when it is reconsidered we are sure that the domains
D_i and D_j will have the same label and label-arc consistency will be applied to it;
by Property 2 this constraint then becomes an implicit functional constraint and is not
checked again.

If each general constraint C_ij of (P) is checked when D_i and D_j have the same label,
then for each tuple (v_r, v_s) of R_ij, the values v_r and v_s will have the same label.
If, after applying LAC, no domain of (P) is empty, we conclude that (P) has a solution,
and all solutions of (P) can be found by taking, for each of them, the values having the
same label. ♦

Note  that  if  each  general  constraint  of  (P)  is  checked  twice,  then  LAC  guarantees  the 
solving  of  (P). 

Example  Let us consider the following CSP with 3 variables and 3 constraints; C12
and C13 are functional, while C23 is a general constraint.

The checking of C12 labels the domains D1 and D2 by 1, the values a and b of D1 and D2
respectively by 1, and the values b and a of D1 and D2 respectively by 2. In the same
way, the checking of the constraint C13 labels D3 by 1, and the values a and b of D3 by
1 and 2 respectively. When the constraint C23 is checked, since D2 and D3 have the same
label, we apply the label-arc consistency concept between these domains, and this
constraint becomes an implicit functional constraint. We cannot verify with classical
arc consistency algorithms whether this problem has a solution. Since no domain is empty
after applying LAC, all solutions of this problem can be computed by taking, for each of
them, the values having the same label.

Assume that Gb is not connected, and let Gp be the partial graph of G obtained by
adding to Gb each edge representing a general constraint of (P) having both its
extremities in the same connected component of Gb. The following corollary identifies
the general constraints of a CSP that are checked at most twice.

Corollary  The general constraints corresponding to the edges added to Gb are all
checked at most twice.

Proof  By Theorem 3, the general constraints of each connected component of Gb are
checked at most twice. ♦
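The graph condition of Theorem 3 and of the corollary can be tested mechanically. The sketch below (function names are ours) computes the connected components of Gb with a small union-find and separates the general constraints whose two extremities lie in the same component, which are the ones covered by the corollary, from those linking two components. The edge lists encode the illustrating example of Section 4.1, as read from its 8 points.

```python
def functional_components(variables, functional_edges):
    """Map each variable to the root of its connected component in Gb."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for i, j in functional_edges:
        parent[find(i)] = find(j)
    return {v: find(v) for v in variables}


def classify_general(variables, functional_edges, general_edges):
    """Return (Gb connected?, general edges inside a component, edges across)."""
    comp = functional_components(variables, functional_edges)
    inside = [(i, j) for i, j in general_edges if comp[i] == comp[j]]
    across = [(i, j) for i, j in general_edges if comp[i] != comp[j]]
    connected = len(set(comp.values())) == 1
    return connected, inside, across


variables = range(1, 8)
functional = [(1, 4), (2, 3), (4, 2), (6, 7)]                        # constraints 2, 3, 4, 11
general = [(1, 2), (1, 3), (3, 4), (4, 5), (3, 5), (1, 6), (1, 7)]   # constraints 1, 5-10

connected, inside, across = classify_general(variables, functional, general)
print(connected)   # False: Gb has three components
print(inside)      # [(1, 2), (1, 3), (3, 4)]
print(across)      # [(4, 5), (3, 5), (1, 6), (1, 7)]
```

The three components found here are the subsets {D1, D2, D3, D4}, {D6, D7} and {D5} of the example.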

Note  The LAC method can be used as a decomposition strategy for solving CSPs
containing functional constraints. Namely, if we want to solve the problem completely,
we can use the LAC method as a preprocessing step for filtering and partitioning the set
of its domains into subsets 𝒟_1, ..., 𝒟_p : p < n of domains. Each subset 𝒟_k has a
representative domain D_k and may be associated with a subproblem (P_k) of (P). We
instantiate only the variables of the representative domains in order to cover the whole
search space of solutions, since each value of each representative domain D_k
participates in at most one solution of (P_k). Thus, the theoretical complexity of
solving such problems is in O(d^p).

5.  Computational  experiments 

We  have  compared  the  performances  of  LAC,  AC5  and  AC5+  over  CSPs  containing 
functional  and  IFC  constraints.  We  have  used  for  all  these  algorithms  the  notion  of 
first  support  introduced  in  AC6  [1]. 

The experiments were performed over randomly generated problems using the random
model proposed in [5].

The generator considers four parameters:

•  n,  the  number  of  variables 

•  d,  the  domain  size  of  each  variable 

•  pc, the probability that a constraint C_ij between two variables exists

•  pu, the probability that a given tuple (a, b) belongs to R_ij

The comparisons concentrate on three parameters: pc ∈ {0.3, 0.6, 0.9}, pu ∈ {0.4, 0.7},
and nf, the number of functional constraints, varying between 4 and 40. Each algorithm
was applied to 20 randomly generated problems for each tuple of values of pc, pu and nf.
All test problems have n = 32 and d = 16. For each instance (pu, pc and nf fixed) and for
each method M, we computed the ratio r_M = (running time of M) / (number of values
removed by M), which expresses the efficiency of the method M (the lower the ratio, the
more efficient the method).
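A generator driven by the four parameters above might look like the following Python sketch. It is an assumption about the general shape of such a model, not the exact generator of [5]: each pair of variables receives a constraint with probability pc, and each tuple is allowed with probability pu. How the nf functional constraints are produced is not described in the text, so that part is left out. The helper `efficiency_ratio` simply evaluates r_M and is undefined when no value is removed.

```python
import random
from itertools import combinations, product


def random_csp(n, d, pc, pu, seed=None):
    """Return {(i, j): set of allowed (a, b) tuples} for a random binary CSP."""
    rng = random.Random(seed)
    relations = {}
    for i, j in combinations(range(n), 2):
        if rng.random() < pc:                          # constraint C_ij exists
            relations[(i, j)] = {
                (a, b)
                for a, b in product(range(d), repeat=2)
                if rng.random() < pu                   # tuple (a, b) belongs to R_ij
            }
    return relations


def efficiency_ratio(running_time, removed_values):
    """r_M = running time of M / number of values removed by M."""
    return running_time / removed_values


# Same sizes as the test problems in the text (n = 32, d = 16).
csp = random_csp(n=32, d=16, pc=0.3, pu=0.4, seed=0)
print(len(csp), "constraints generated")
```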

LAC versus AC5

Figures 1a to 1c show the comparison between LAC and AC5. The x-axis represents the
number of functional constraints and the y-axis the ratio r_LAC / r_AC5. This ratio
expresses the relative efficiency of LAC and AC5 (when the ratio is less than 1, LAC
outperforms AC5). We observed that these methods have almost the same running time for
problems having few functional constraints (4<nf<8). The LAC method becomes faster than
AC5 when the number of functional constraints increases. The number of values removed by
LAC is always greater than the number removed by AC5.

LAC  versus  AC5+ 

As mentioned above, the AC5+ algorithm performs a specific treatment only for IFC
constraints, while our method deals with functional constraints including the IFC
constraints. For this reason, we have considered only problems containing IFC and
general constraints for evaluating the performances of these methods (all functional
constraints of the test problems are IFC constraints). Figures 2a to 2c show the
comparison between LAC and AC5+. The x-axis represents the number of functional
constraints and the y-axis the ratio r_LAC / r_AC5+. This ratio expresses the relative
efficiency of LAC and AC5+. We observe that the LAC method outperforms AC5+ on all tested
problems. We can see that the ratio r_LAC / r_AC5+ decreases with the number of IFC
constraints, which means that the efficiency of the LAC method increases with the number
of functional constraints. This may be explained by the fact that AC5+ performs the
specific treatment on most of the IFC constraints when the problems contain few IFC
constraints (4 ≤ nf < 12), since in this case the order in which the constraints are
checked does not affect the number of IFC constraints checked only once. When the number
of IFC constraints exceeds 12, the LAC method largely outperforms AC5+, since in this
case the number of IFC constraints checked many times by AC5+ increases.


Fig. 1b. LAC/AC5 (pc=0.6)    Fig. 2b. LAC/AC5+ (pc=0.6)

The computational experiments show the gain brought by the LAC method. The efficiency
of the LAC method is justified by its performance (the number of constraint checks is
lower than that of AC5 and AC5+) and its effectiveness (the number of removed values is
greater than that of AC5 and AC5+).

6.  Conclusion 

We have proposed a labelling arc consistency method, LAC, for filtering CSPs
containing functional constraints. This method uses two consistency concepts, the
classical arc consistency and the label-arc consistency. The performance and the
effectiveness of LAC on CSPs containing functional constraints are better than those of
the classical arc consistency algorithms since, on the one hand, this method avoids
checking some constraints many times and, on the other hand, LAC allows in some cases the
solving of the CSP or the improvement of the effectiveness of the filtering. The
technique used in LAC makes its incremental aspect efficient since, as seen above, this
method does not recheck a part of the set of constraints. The main application of LAC is
in constraint programming languages. The empirical results presented show the
substantial gain brought by the LAC method.

Abstract. A wavelength-routed optical network employs all-optical channels
(lightpaths) on multiple wavelengths to establish a rearrangeable interconnection
pattern (virtual topology) for transport of data. A lightpath may span multiple
fiber-links, and may be routed optically at an intermediate node without undergoing
electronic conversion. We examine the problem of establishing a set of lightpaths in an
optical network which employs a passive wavelength routing device called a Latin Router
(LR). Latin Routers are attractive for optical network design because of their
fault-tolerance and low cost, but make traditional routing algorithms difficult to
implement due to the complexity of the constraints they impose on legitimate routes and
colors. We employ a local search algorithm to search the space of virtual topologies in
order to satisfy a maximum number of given lightpath requests. We use the same algorithm
to maximize the number of single-hop connections for a given network. We show that the
algorithm can satisfy a high percentage of lightpaths under low to moderate network
loads. Experiments reveal that we can establish O(N) lightpaths in an optical network
with N nodes. We believe that our work is the first known attempt at designing optical
wide-area networks using Latin Routers.


Keywords:  wavelength  routing,  latin  routers,  local  search,  optical 
networks 


1  Introduction 

A lightpath is used in a wavelength-routed optical network to establish high-speed,
all-optical channels which can span multiple fiber links without undergoing electronic
processing at the intermediate nodes of the network. For example, in Fig. 1, optical
lightpath LP4 is established directly connecting nodes 4 and 2 through an all-optical
channel. In the absence of wavelength conversion devices at the intermediate nodes of
the network, a lightpath will be on the same wavelength on all the fiber-links which it
traverses; the lightpath will be switched optically at the intermediate nodes, e.g.,
lightpath LP4 is optically switched at node 5 and node 1 before it finally terminates at
node 2. A lightpath is typically a unidirectional channel of communication, i.e., a
lightpath from node 4 to node 2 does not necessarily mean that there will be a lightpath
from node 2 to node 4.


[Figure panels: Underlying Graph G; G(LP)]

Fig. 1. From G to G(LP).


Using all-optical lightpaths in the network architecture considerably reduces
the processing time at intermediate switching nodes, because forwarded traffic
is switched optically. If two lightpaths traverse one or more common fiber links, the
lightpaths must necessarily be operated on different wavelengths. For example, in
Fig. 1 lightpaths LP1 and LP4 traverse a common fiber 4-5, and hence should
be on different wavelengths. Typically the number of wavelengths available in
the network is fixed at some maximum number, and is limited by the technology
used to build the network.

A routing node in this network employs an optical component for wavelength
routing of all-optical lightpaths. We use a Latin Router as a passive wavelength
router in the optical component of the router, because of its low cost and robustness.
A K × K Latin Router (LR) (shown in Fig. 2) provides complete connectivity
between every input and output port, by passively routing optical
connections on K wavelengths. A particular kind of Latin Router, the Shift Latin Router
(SLR), has a fixed cyclical-permutation-based interconnection pattern between
its input and output ports, e.g., wavelength k on input port i is always routed to
the same wavelength on output port $(i + k) \bmod K$, $\forall i, k \in [0, 1, \ldots, K-1]$.
A particular feature of the Latin Router is that the number of wavelengths supported
in the router is equal to the size of the Latin Router.

Fig. 2. Latin Square Router.

A common problem in optical network design is: given a physical network
topology and a number of available wavelengths (hereafter called colors), can
we establish a given set of lightpaths? Requested lightpaths are given as
source-destination pairs of nodes in the underlying physical topology. Previously
proposed algorithms typically solve this problem in two distinct phases: routing and
coloring. The routing algorithm uses traditional shortest-path-based algorithms
to generate a set of routes for each lightpath. Each lightpath is then assigned a
color, such that no two lightpaths passing through a common link are assigned
the same color. Most existing network designs are based on the Wavelength Routing
Switch (WRS), which is a reconfigurable router allowing any wavelength to
be switched from any input port to any output port, and hence does not impose
any restrictions on the routing algorithm [BM95]. The advantage of this scheme
is that routing and coloring are both well-studied problems, and therefore a large
number of existing techniques can be employed to solve the problem.

While the above techniques work well in WRS-based networks, we must find
different solutions for LR-based networks. In particular, Shift Latin Routers
make it difficult to separate the algorithmic process into routing and coloring
stages. The constraints that a Shift Latin Router imposes on the wavelength
assigned to a lightpath need to be accommodated by the routing algorithm;
otherwise unacceptably small numbers of lightpaths will have colorings
which obey those constraints. Modifying the routing algorithms to minimize the
impact of those constraints is difficult and substantially complicates the routing
process (this is explained later in the Appendix). It is important to develop
schemes to handle the constraints that these routers impose on optical routing.

An alternative approach to constraint satisfaction is local search, which has
proven successful at rapidly solving a variety of constraint problems [SLM92,
MJPL92]. Local search makes small changes to a complete assignment of variables
in a constraint problem in order to improve the quality of the solution. We
have devised an algorithm called the Local-search Optical Network Configuration
Algorithm (LONCA¹) which routes static requests for an optical network
built using Latin Routers. LONCA addresses two problems, which are as follows.

- Establish the maximum number of a set of requested lightpaths in a network.
- Establish the maximum number of single-hop² connections in a network.

¹ A lonka is an Indian chili pepper.

LONCA operates by performing local search on the space of virtual topologies
(i.e. possible interconnections of routers) to find topologies which satisfy
the maximum number of lightpath requests or which establish the maximum
number of single-hop connections in an optical network.

It  is  not  possible  to  set  up  a  complete  graph  as  a  virtual  topology  by  estab¬ 
lishing  lightpaths  to  provide  single-hop  connectivity  between  every  pair  of  nodes. 
In this case, traffic might need to go through multiple lightpaths, undergoing
electronic switching at the endpoints of adjacent lightpaths. Minimizing
the average number of optical hops that traffic has to traverse in the network to
travel from a source node to a destination node is a related optimization problem.
LONCA  does  not  handle  this  problem  directly,  but  we  investigate  the  virtual 
topologies  maximizing  the  number  of  single-hop  connections  to  see  how  many 
optical  hops  are  necessary  to  establish  all  connections  in  the  network. 

In  §2  we  provide  the  problem  formulation,  and  in  the  Appendix  we  describe 
the  constraints  imposed  by  the  physical  network  topology  and  the  routers  used 
on  the  lightpath  establishment  problem.  We  discuss  traditional  constraint  solving 
techniques,  which  separate  routing  and  coloring,  in  §3.  In  §4,  we  discuss  LONCA, 
which  is  based  on  the  well-known  local  search  paradigm.  We  present  simulation 
results  for  this  algorithm  on  randomly  generated  problem  instances  in  §5.  Our 
results suggest that LONCA can establish O(N) lightpaths in a network with N
nodes.  This  can  help  in  drastically  reducing  the  average  hop  distance  in  the 
network.  In  §6  we  conclude  and  discuss  future  work. 

2  Routing  in  Optical  Networks 

We  discuss  the  problem  formulation,  and  briefly  mention  the  traditional  routing 
algorithms  used  to  satisfy  a  set  of  lightpath  requests.  In  the  Appendix,  we  discuss 
how  different  parameters  in  the  underlying  graph  and  the  chosen  router  can  affect 
the  constraint-satisfaction  problem. 

2.1  Problem  Formulation 

The input to the problem is a graph $G = \langle V, E \rangle$ with $|V| = N$, a number of
available wavelengths $k$, and a set of $l$ requested lightpaths $C = \{(s_i, d_i)\}$. The
number of wavelengths available restricts the number of lightpaths which can
traverse a single link. Lightpaths are directed; in other words, we may desire
to have a directed lightpath from $s_i$ to $d_i$ without a lightpath from $d_i$ to $s_i$.
Traditionally we employ a routing algorithm which takes the list $C = \{(s_i, d_i)\}$ and
produces a set of uncolored routes $LP = \{lp_{i1}, lp_{i2}, \ldots, lp_{ij}\}$ such that
$\forall i,\ lp_{i1} = s_i \wedge lp_{ij} = d_i$. Two lightpaths $lp_a$, $lp_b$ can’t have the same color if they share an
edge $(i, j) \in G$. This motivates a graph coloring problem where each connection
corresponds to a node of an auxiliary graph; two nodes have an edge between
them in the auxiliary graph if their corresponding lightpaths share an edge in the
underlying graph G. We shall refer to this auxiliary graph as the graph induced
by LP and denote it as G(LP). Fig. 1 shows an example of constructing G(LP)
from G. We observe, for example, that $lp_1$, $lp_2$ and $lp_4$ all use edge 4-5 in the
graph, and so they form a 3-clique in G(LP).

² A single-hop connection is a lightpath connecting two nodes of the network, such
that they can communicate with each other in one optical hop.

Given N nodes in G, G(LP) has $|LP| \le N(N-1)$ nodes. The number of
edges in G(LP) is dependent on the routing of the lightpaths. Intuitively, shorter
routes for lightpaths result in fewer edges in G(LP), because fewer lightpaths
share the fiber links. The network topology, and the size of the router used, have
also been shown to impact lightpath routing [BH95].
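
The construction of G(LP) described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: routes are given as node sequences, fiber links are treated as directed pairs, and the routes and the names `route_edges` and `conflict_graph` are hypothetical (they are not the routes of Fig. 1).

```python
from itertools import combinations


def route_edges(route):
    """Return the set of directed links (u, v) used by a route given as a node list."""
    return {(u, v) for u, v in zip(route, route[1:])}


def conflict_graph(routes):
    """Build the edge set of G(LP): one node per lightpath, and an edge
    between two lightpaths whenever their routes share a link of G."""
    links = {name: route_edges(r) for name, r in routes.items()}
    edges = set()
    for a, b in combinations(sorted(links), 2):
        if links[a] & links[b]:
            edges.add((a, b))
    return edges


# Hypothetical routes chosen for illustration only.
routes = {
    "lp1": [3, 4, 5],
    "lp2": [4, 5, 6],
    "lp3": [1, 2],
    "lp4": [4, 5, 1, 2],
}
print(sorted(conflict_graph(routes)))
# [('lp1', 'lp2'), ('lp1', 'lp4'), ('lp2', 'lp4'), ('lp3', 'lp4')]
```

In this example the three routes that use link (4, 5) form a 3-clique, and shortening `lp4` so that it avoids link (1, 2) would remove its edge to `lp3`, which is the effect of shorter routes noted above.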


Fig. 3. Lightpaths in a Multigraph.


2.2  Physical  Network 

The characteristics of the network topology represented by G, and the router
used at the network nodes, influence the solution to the problem. It is clear that
the more edges G has, the less constrained the graph coloring problem is likely
to be, because there will be fewer lightpaths sharing a fiber, thus decreasing the
number of edges in G(LP). If the underlying graph is a multigraph, i.e., there are
multiple fibers connecting adjacent nodes in the physical topology, then more
connections can pass between two heavily congested nodes without an increase
in the number of colors the fiber must support. In Fig. 3 we see 3 lightpaths
passing from edge 2 to edge 5. In a simple graph this would result in a 3-clique
in G(LP); however, clever selection of the edges the lightpath uses can result in
a less restrictive constraint graph. We discuss the impact of fiber multiplicity on
lightpath routing in §3.

2.3  Optical  Wavelength  Routers 

Optical  networks  can  be  built  using  many  types  of  optical  routers.  The  different 
capabilities  in  these  routers  impose  different  constraints  on  legal  routes  and  have 
an  impact  on  the  constraint  problem  instance.  A  WRS  [BM95]  can  perform 
arbitrary  routing  of  wavelengths;  it  allows  multiple  wavelengths  to  be  optically 
switched  from  any  input  fiber  to  any  output  fiber,  as  long  as  multiple  lightpaths 
on  the  same  wavelength  don’t  need  to  be  switched  onto  the  same  output  fiber; 
however  WRSs  are  costly  to  build. 

The routing pattern in an Arbitrary Latin Router (ALR) is based on the
Latin Square; for a router of size K the routing pattern consists of a K × K
table. Each fiber attached to a node in G is attached to a single row (input port)
and a single column (output port) of the table, and table entry (i, j) corresponds
to the wavelength switched from input port i to output port j as shown in Fig.
2. The routing table has the property that no color appears twice in the same
row or in the same column of the table, and only one color occupies each entry
of the table. These two properties ensure that there is only a single wavelength
which is switched from any input port to any output port, and that lightpaths
switched onto an output port are distinct colors. Rows and columns which have
no edge attached to them may be used as access ports, and are used to terminate
lightpaths originating or ending at the node. Access ports are labeled with an
S in Fig. 2. We do not know of any existing prototypes of Arbitrary Latin
Routers.

A more restricted form of Latin Router is called a Shift Latin Router (SLR).
This router has the additional property that the wavelengths appear in increasing
order in the top row of the table, and each subsequent row is the previous row
rotated by one table cell; formally, wavelength k on input port i is always routed
to the same wavelength on output port $(i + k) \bmod K$, $\forall i, k \in [0, 1, \ldots, K-1]$.
Such a router appears in Fig. 2. We have used letters to denote wavelengths in
order to avoid confusion with the labels for the incoming edges. In this figure,
we observe that a connection routed from input edge three to output edge one is
assigned wavelength E.
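
The shift rule can be checked mechanically. The sketch below is an illustration only: it assumes 0-indexed ports and integer wavelengths (the figures use letters and 1-indexed ports), and the function names are hypothetical. It builds the K × K table implied by the rule and verifies the Latin Square property described for ALRs.

```python
def shift_latin_table(K):
    """table[i][j] is the wavelength switched from input port i to output port j.
    Wavelength k on input i leaves on output (i + k) mod K, so table[i][j] = (j - i) mod K."""
    return [[(j - i) % K for j in range(K)] for i in range(K)]


def is_latin_square(table):
    """True if every wavelength appears exactly once in each row and each column."""
    K = len(table)
    symbols = set(range(K))
    rows_ok = all(set(row) == symbols for row in table)
    cols_ok = all({table[i][j] for i in range(K)} == symbols for j in range(K))
    return rows_ok and cols_ok


K = 5
table = shift_latin_table(K)
for row in table:
    print(row)
print(is_latin_square(table))  # True
```

The top row lists the wavelengths in increasing order and each later row is the previous one rotated by one cell, matching the description above.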

Latin Routers impose a variety of different constraints on both routing and
coloring. Traditional routing techniques can handle some of the constraints for
ALRs [CB95], but SLRs add immense complexity to the problem, requiring intricate
routing algorithms and the generation of polynomially many constraints
even before coloring begins. This is because there are $O(NK^2)$ coloring constraints
(N being the number of nodes, and K × K being the size of each router)
which are imposed by the static routing property of the Latin Router, in addition
to the coloring constraints imposed by two lightpaths sharing the same
fiber. The reader interested in more details is referred to the Appendix.

Fig. 4. Interconnection using Latin Squares.


In Fig. 4 we show an example of a 3-node network composed of 3 × 3 Latin
Routers.  Each  undirected  edge  of  the  graph  is  represented  by  2  directed  edges 
between  the  routers.  As  before,  an  S  by  the  input  port  (row)  denotes  the  port  at 
which  connections  originate  at  the  node,  while  an  S  at  the  output  port  (column) 
denotes  connections  terminating  at  the  node.  Let  us  suppose  we  want  a  lightpath 
established  between  nodes  1  and  2.  Input  port  3  of  node  1  is  the  designated  port 
where  connections  originate,  and  the  edge  from  output  port  2  runs  from  node 
1  to  node  2.  This  edge  enters  port  1  of  node  2,  and  output  port  3  of  node  2 
terminates  this  lightpath.  In  order  for  this  to  be  a  valid  lightpath,  table  entry 
(3,2)  of  node  1  and  table  entry  (1,3)  of  node  2  must  have  the  same  wavelength; 
if  we  look  at  the  circled  table  entries  we  see  that  both  entries  have  wavelength 
C.  So  this  edge  denotes  a  lightpath.  If  we  want  to  establish  a  lightpath  from  3 
to  1,  however,  we  can’t  take  the  edge  from  3  to  1  since  entry  (3,1)  of  node  3  is  B 
and  entry  (1,3)  of  node  1  is  C.  In  fact,  we  can’t  establish  a  lightpath  from  node 
3  to  node  1  in  this  configuration  at  all. 
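
The check performed in this example can be written down as a small sketch. It assumes that every node uses the same 3 × 3 shift table with 1-indexed ports and wavelengths A, B, C, which is consistent with the table entries quoted above; the function names are hypothetical.

```python
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def slr_entry(i, j, K):
    """Wavelength at 1-indexed entry (i, j) of a K x K shift Latin table (K <= 26)."""
    return LETTERS[(j - i) % K]


def lightpath_wavelength(hops, K):
    """hops lists one (input_port, output_port) pair per node visited.
    Returns the common wavelength, or None if the table entries disagree."""
    colors = {slr_entry(i, j, K) for i, j in hops}
    return colors.pop() if len(colors) == 1 else None


K = 3
# Node 1 to node 2: entry (3,2) at node 1, then entry (1,3) at node 2.
print(lightpath_wavelength([(3, 2), (1, 3)], K))  # C
# Node 3 to node 1 over the direct edge: entry (3,1) at node 3, entry (1,3) at node 1.
print(lightpath_wavelength([(3, 1), (1, 3)], K))  # None
```

The second call fails because the two entries are B and C, which is the mismatch described in the text.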


3  Previous  Work 

We briefly discuss previous work on algorithms designed to route and color
lightpaths on optical networks. In most proposed algorithms, routing and coloring are
treated as separate phases of the algorithm [BM95, RS94, ZA94]. We first discuss
algorithms for Wavelength Routing Switches, and then for Arbitrary Latin
Routers.

3.1 Lightpath Establishment for WRS-based Networks

[BM95] present an analysis of routing and coloring techniques for establishing
lightpaths using a WRS. Since the WRS imposes no additional constraints on
coloring,  the  routing  algorithm  should  provide  a  graph  coloring  problem  instance 
which  is  as  easy  to  solve  as  possible.  One  approach  is  to  find  a  set  of  p  short  routes 
for  each  lightpath  and  pick  the  route  which  adds  the  fewest  edges  to  G{LP).  A 
modification  of  Dijkstra’s  algorithm  can  be  used  to  compute  the  p-shortest  paths 
for  each  connection. 

Once G(LP) is created, any coloring algorithm can be used to color the
resulting graph. On-line coloring algorithms have been used [BM95], although
other fast algorithms such as min-conflicts can also be used [MJPL92]. An interesting
special case of the problem occurs when the underlying network topology
is a single cycle. Clearly the shortest path in such a network is an arc of length
less than or equal to N/2. We see that the resulting constraint graph is a circular
arc graph. While coloring such graphs is still known to be NP-hard, [Tuc75]
gives an upper bound on the number of colors required by such graphs and a
multi-commodity flow algorithm which solves the problem; [MIR93] gives on-line
algorithms which approximate the number of required colors to within a
constant factor.
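
As an illustration of the coloring phase, the following sketch applies a generic first-fit rule to a constraint graph. This is a textbook heuristic shown under our own assumptions, not the specific on-line algorithm of [BM95] or [MIR93]; the lightpath names and the function `first_fit_coloring` are hypothetical.

```python
def first_fit_coloring(nodes, edges, num_colors):
    """Color lightpaths in the given order with the smallest color not used by
    an already-colored neighbor. Lightpaths with no free color are left out."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    coloring = {}
    for n in nodes:
        used = {coloring[m] for m in neighbors[n] if m in coloring}
        free = [c for c in range(num_colors) if c not in used]
        if free:
            coloring[n] = free[0]
    return coloring


nodes = ["lp1", "lp2", "lp3", "lp4"]
edges = [("lp1", "lp2"), ("lp1", "lp4"), ("lp2", "lp4"), ("lp3", "lp4")]
print(first_fit_coloring(nodes, edges, 3))
# {'lp1': 0, 'lp2': 1, 'lp3': 0, 'lp4': 2}
```

With only two colors available, `lp4` would remain uncolored because it belongs to a 3-clique, which corresponds to a lightpath request that cannot be established.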


3.2  Lightpath  Establishment  for  ALR-based  Networks 

[CB95]  present  a  discussion  of  routing  in  networks  using  Arbitrary  Latin  Routers. 
Their  approach  is  to  use  p-shortest  path  as  described  above  to  find  available 
routes for each connection. The algorithm selects the route which minimally
constrains routers, and then assigns a wavelength which minimally constrains
those routers. This algorithm was developed for ALRs and does not address the
question  of  routing  in  networks  composed  of  SLRs.  To  our  knowledge  no  previous 
work  has  devised  an  optical  routing  algorithm  for  networks  using  SLRs. 

4  Satisfying  Connections  Using  Local  Search 

Traditional  constraint  satisfaction  techniques  use  routing  to  generate  an  easy 
coloring  problem,  then  satisfy  the  coloring  constraints.  There  are  polynomially 
many  coloring  constraints  related  to  the  configuration  of  the  SLRs,  and  it  is 
difficult  to  write  routing  algorithms  which  result  in  easy  coloring  problems  due 
to  the  intricate  nature  of  these  constraints. 

Local search has been successful in finding satisfying assignments for solvable
K-SAT problems [SLM92] and graph coloring problems [MJPL92]. While it does
not guarantee a solution, in practice local search algorithms tend to solve constraint
problems with solutions very quickly. Local search algorithms examine
small changes to a complete assignment of variables in a constraint problem and
select one of the changes which improves the number of satisfied constraints the
most. This procedure is sometimes referred to as gradient ascent or hill-climbing.
Since complete search of all virtual topologies appears to be costly, we felt that
local search was an attractive alternative.

We decided to use local search to find a virtual topology which would satisfy
the most lightpath requests; in effect, we merge the routing and coloring stages
into the same algorithm. To do so, we make small changes in the virtual topology
by changing the connections between edges in the graph and ports on the SLRs,
thereby changing the mapping of colors onto edges. For instance, in Fig. 4, if
we wanted a lightpath between nodes 3 and 1 and the nodes were connected in
the fashion indicated, we would be unable to satisfy this request. However, if we
were to change the edges assigned to the output ports of node 3 so that the edge
from node 3 to node 1 was connected to output port 2, then the wavelength at
entry (3,2) of node 3 and the wavelength at entry (1,3) of node 1 (corresponding
to the termination of a connection) are both C, indicating that this is now a
valid lightpath. To make this change, we would switch the edge running from
node 3 to node 2 to output port 1 on node 3. Swapping pairs of input port or
output port locations defines a natural neighborhood for local search.

We present the Local-Search Optical Network Configuration Algorithm (LONCA)
in Fig. 5. Making a change to a single router is simply a matter of swapping
the position of 2 of the edges attached to either input or output ports of a Latin
Router. If the router size is K then there are at most $K^2 - K$ row swaps and
column swaps possible for each router in G, that is, $K(K-1)/2$ swaps of each kind.
These local changes to the network
configuration allow us to move through the space of all virtual topologies. At
each iteration of the local search algorithm, we select the edge swap which increases
the number of satisfied lightpath requests the most; if there are several
such changes we choose any one among them at random. For a single iteration,
we examine swaps in only one router due to the cost of analyzing the updates
and the large number of swaps which must be examined.


procedure LONCA(G, VC, RouterSize, MaxSwaps)
    connect all routers in G at random
    for i = 1 to MaxSwaps
        pick a router at random
        for each row and column swap
            evaluate the number of connections satisfied
            update set of best swaps
        end for
        pick one of the best swaps and do it
        if all connections satisfied exit
    end for
end


Fig.  5.  LONCA  Algorithm  Sketch. 
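
The sketch in Fig. 5 can be made concrete as follows. This is a minimal illustration and not the authors' implementation. It makes several assumptions that are not stated in the paper: every router is a shift Latin router, each node has exactly one access row and one access column, fiber multiplicity is one, and the set of established connections is found by tracing each wavelength from each access input port until it reaches an access output port. All names (`random_ports`, `established`, `lonca`) are hypothetical.

```python
import random
from itertools import combinations

ACCESS = "S"  # label for the single access port assumed at each node


def random_ports(neighbors, K, rng):
    """Attach each incident edge and one access port to random rows and columns."""
    ports = {}
    for v, nbrs in neighbors.items():
        assert len(nbrs) + 1 <= K, "router too small for this node"
        rows = list(nbrs) + [ACCESS] + [None] * (K - len(nbrs) - 1)
        cols = list(rows)
        rng.shuffle(rows)
        rng.shuffle(cols)
        ports[v] = {"in": rows, "out": cols}
    return ports


def established(ports, K):
    """Trace every wavelength from every access input port and return the set
    of (source, destination) pairs joined by a lightpath."""
    pairs = set()
    for s in ports:
        for k in range(K):
            node, came_from = s, ACCESS
            for _ in range(len(ports) * K):  # bound on the length of a trace
                i = ports[node]["in"].index(came_from)
                nxt = ports[node]["out"][(i + k) % K]  # shift rule
                if nxt is None:          # unused output port
                    break
                if nxt == ACCESS:        # lightpath terminates here
                    if node != s:
                        pairs.add((s, node))
                    break
                node, came_from = nxt, node
    return pairs


def lonca(neighbors, requests, K, max_swaps, seed=0):
    """Local search over port assignments, following the sketch in Fig. 5."""
    assert K >= 2
    rng = random.Random(seed)
    requests = set(requests)
    ports = random_ports(neighbors, K, rng)

    def score():
        return len(requests & established(ports, K))

    for _ in range(max_swaps):
        if score() == len(requests):
            break
        v = rng.choice(list(ports))          # pick a router at random
        best, best_val = [], -1
        for side in ("in", "out"):           # row swaps, then column swaps
            p = ports[v][side]
            for a, b in combinations(range(K), 2):
                p[a], p[b] = p[b], p[a]      # try the swap
                val = score()
                p[a], p[b] = p[b], p[a]      # undo it
                if val > best_val:
                    best, best_val = [(side, a, b)], val
                elif val == best_val:
                    best.append((side, a, b))
        side, a, b = rng.choice(best)        # pick one of the best swaps
        p = ports[v][side]
        p[a], p[b] = p[b], p[a]
    return ports, score()


# Usage on a hypothetical 4-node ring with 3 x 3 routers.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
ports, satisfied = lonca(ring, {(0, 1), (1, 2), (2, 0)}, K=3, max_swaps=50)
print(satisfied)
```

As in Fig. 5, the best swap is applied even when it does not improve the score, so the search can move across plateaus; each router offers K(K − 1)/2 row swaps and as many column swaps per iteration.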


A related problem to that of satisfying requested connections is that of maximizing
the number of connections established in the network. In some cases
network designers may not have a clear set of requests in mind, and may try
to maximize the number of lightpaths that can be established in the network,
in order to minimize the average hop distance in the network. We use LONCA
with a complete graph as the requested virtual topology to solve this problem.

5  Empirical  Results 

We tested LONCA on physical networks of different sizes and with differing
numbers of requested connections. In each case we generated physical networks
in the following way: we required each network to be a biconnected graph with
average degree varying uniformly between two and seven, hence the number of
edges in the networks was 4.5N. These assumptions are based on characteristics
of present-day fiber networks. Lightpath requests were generated by selecting
source-destination pairs of nodes chosen uniformly at random without replacement.
For these experiments we assume that all routers are of the same size K
and that the multiplicity m of each edge in the graph is the same.


5.1  Satisfying  Requested  Lightpaths 

Our first set of experiments was designed to analyze LONCA’s ability to satisfy
a set of requested connections. We generated 100 sets of connections for
each of 5 different physical topologies, each consisting of 50 nodes. We analyzed
networks with two different configurations: m = 1, K = 8 and m = 2, K = 15.
We chose the router size such that a node in the graph with maximum degree
would be guaranteed at least one access port for termination of lightpaths; hence
$K \ge m\Delta + 1$, where $\Delta$ is the maximum physical degree. Moreover, Latin
Router prototypes of sizes 8 and 15 have been
reported in the literature [DEK91]. We ran LONCA once per set of connections
with MaxSwaps set to 500 since our initial tests indicated that after 500 swaps
LONCA was unable to improve the number of requests satisfied. We generated
numbers of connections ranging from 2N to 10N and computed the proportion of
connections satisfied. We present the results in Fig. 6. Each line of the plot indicates
the performance of differing numbers of connections for the same physical
network.

We  observe  that  the  proportion  of  established  connections  is  higher  for  a 
fiber  multiplicity  of  two  than  for  a  multiplicity  of  one.  We  observe,  as  expected, 
that  in  both  cases  the  percentage  of  lightpaths  satisfied  decreases  as  the  number 
of  requests  increases.  However,  the  degradation  in  number  of  requests  satisfied 
is  relatively  slow,  with  over  70%  of  500  requests  satisfied  for  a  router  size  of  15 
and  a  fiber  multiplicity  of  two. 

Fig. 6. Mean number of requested connections established.

There are likely to be connections which cannot be established due to
violations of the Arbitrary Latin Router constraint. For example, if 10 requested
lightpaths originate at a node with a physical degree of seven and a single source
port, we can only set up seven of these connections. We were able to analyze
those lightpath requests which were not satisfied with respect to the number
of lightpath requests for the source and destination nodes. We found that for
m = 1, only a small fraction of lightpath request failures were due to source or
destination port overload, and that for m = 2 no lightpath request failures were
caused by this problem.

5.2 Maximizing Single-Hop Connections

In our next experiment we ran LONCA on 100 different networks of sizes 50-100
incremented by 10. Our objective was to maximize the number of single-hop
connections established in the given networks; in these experiments we asked
LONCA to attempt to establish all $N(N-1)$ directed connections possible in
the network. We ran LONCA 10 times per network configuration (to average out
the randomizing effect of the local search algorithm) again using two network
configurations: m = 1, K = 8 and m = 2, K = 15 with MaxSwaps set at 500.
Fig. 7 shows the scaling in the number of connections we were able to establish.
We see that LONCA is able to set up about 11.4N connections on N nodes
when m = 1 and 22.8N connections when m = 2; we conjecture that LONCA
can establish about 11.4mN connections in a network with multiplicity m, but
we need to run more experiments to verify this claim.

Fig. 7. Mean number of single-hop connections established.

We examined the resulting networks to determine how many nodes could
be reached in one or two optical hops. We report on the results obtained for
100-node networks; again we tested the case for 100 different topologies with
m = 1, K = 8 and m = 2, K = 15. When m = 1 LONCA was able to establish
an average of 74% of all $N^2 - N$ connections in one or two optical hops. If m
increases to two, LONCA can establish an average of 99% of the connections
(Fig. 8).

6  Conclusions  and  Future  Work 

We have presented and analyzed LONCA, a local search algorithm designed
to perform routing of static lightpaths in all-optical networks. We show that
LONCA is able to satisfy a high proportion of lightpath requests effectively in
networks with high edge multiplicity, for a large number of requested connections.
We also showed that LONCA can establish O(N) lightpaths in a network
of size N, and can effectively connect every node in a 100-node network with
nearly every other node simultaneously in one or two optical hops, using a router
size of 15 × 15 and a fiber multiplicity of two.


m    Mean      Sdev
1    74.07%    1.43%
2    99.28%    0.23%

Fig. 8. Percentage of Connections Established in 2 Hops.


We were disappointed that we could not effectively analyze the reason for
LONCA’s inability to establish lightpaths. Clearly, understanding the problem
in establishing lightpaths will help us create more effective algorithms for establishing
lightpaths in the future. We plan to study theoretical upper and lower
bounds on the number of lightpaths that can be established in networks of Shift
Latin Routers.

There  are  many  variants  of  local  search  algorithms  which  improve  perfor¬ 
mance.  Most  of  these  variants  force  more  vigorous  exploration  of  the  assignment 
space  by  promoting  changes  to  the  assignment  involving  frequently  unsatisfied 
constraints  or  frequently  ignored  variables.  We  hope  to  investigate  the  impact 
of  these  improvements  on  LONCA’s  performance. 

We  have  only  tested  physical  networks  of  one  type  in  our  experiments.  We 
hope  to  continue  experimenting  with  both  more  sparse  and  more  dense  networks 
to analyze LONCA’s ability to satisfy constraints. We have suggested that when
a network is a Hamiltonian Cycle we may be able to devise better routing
algorithms and coloring algorithms. Other special case networks such as regular
graphs are also worth examining.

A major drawback to our experiments is the selection of sets of requests
to examine. In the analysis section we mention that, even before routing, we
can guarantee some connections will not be established due to excessive load at
either the source or the destination. We therefore wish to examine a somewhat
different problem: given a model of generating requests for lightpaths, how do
we build inexpensive networks using SLRs which perform within some specified
tolerance?

A Appendix: The Constrainedness of Latin Routers

In this section we discuss the constraints Latin Routers impose upon constraint
satisfaction in more detail. One of the features of a Latin Router is that the
number of wavelengths supported by a Latin Router is equal to the size of the
Latin Router. It is not known how to interconnect a network of nodes employing
Latin Routers of different sizes (as this would bring a mismatch in the number of
wavelengths supported in different parts of the network). Therefore, we assume
that the size of the Latin Router used is the same for every node in the network.
The router size K must be larger than the maximum physical degree $\Delta$ in the
physical topology, i.e., $K > \Delta$. In addition, we notice that if the degree of a
node is much less than K, there may be a large number of free ports in the
node. Also, since the number of colors supported is equal to the dimension of
the router, we must increase the router dimension if we need more colors to color
the constraint graph, subject to limitations imposed by the maximum router size.
Finally, since one table entry switches an incoming connection on one edge out
to another edge and each table entry only switches one color, we know that a
demand to switch two colors from one incoming edge to the same outgoing edge
can’t be satisfied. This imposes restrictions on the constraint graph which are
known at routing time; namely two or more lightpaths can’t share two adjacent
edges in G. Fig. 3 shows this situation: we cannot route two virtual circuits from
edge 2 to edge 5 using a Latin Router. This restriction also has an impact on
the number of virtual circuits originating and terminating at a node; if there are
s source ports and the degree of the node in the physical network is d then only
sd connections can originate or terminate at the node. We refer to this as the
bypass restriction in an LR-based network.

A.1 Routing for Arbitrary Latin Routers

We  notice  that  the  only  additional  constraint  that  an  Arbitrary  Latin  Router 
imposes  on  establishing  connections  is  due  to  bypass  restrictions;  there  are  no 
additional coloring constraints. This implies that if we can manage to route
without violating the bypass restrictions, we can utilize existing coloring algorithms
to  assign  wavelengths  to  the  lightpaths. 

This problem is alleviated by increasing the size of the router and by
increasing the fiber multiplicity of edges in the network; two routes can now be
established on different fibers while still routing connections to the same node of the
network. Fig. 3 shows how increasing the multiplicity alleviates the problem in
Latin Routers. If the multiplicity is $\mu$ we see that we can route $\mu^2$ connections
sharing two adjacent edges; the easing of restrictions extends to routes to and from
the source ports as well. With reference to the special case when the underlying
network is a single Hamiltonian Cycle, we observe that if Latin Routers are to
be used, we must increase the multiplicity of edges in the physical network. We
observe that if the maximum clique in $G(LP)$ has size $L$ then the multiplicity of edges
must be $\ge \sqrt{L}$, resulting in router sizes of at least $\Delta\sqrt{L}$.

A.2 Routing for Shift Latin Routers

The Shift Latin Router is a more restricted form of Latin Router. We can
characterize the form this router must take if we label the available wavelengths
as integers from 1 to K. If we examine any four table entries (i,a), (i,b), (j,a) and
(j,b) and refer to the color of entry (x,y) as $c_{xy}$, then
$(c_{ia} - c_{ib} + c_{jb} - c_{ja}) \bmod K = 0$.
The number of Latin Squares with this property is far smaller than the
number of arbitrary Latin Squares [CPS90]. This restriction turns out to be quite
difficult to accommodate in lightpath establishment. Routing and Wavelength
Assignment algorithms determine the sequence of edges used by lightpaths and
then assign these paths non-conflicting colors. However, in this case we need to
make certain that all the color entries obey the above restrictions; in addition
to encoding the normal constraints we must also generate and accommodate
$O(NK^4)$ additional constraints imposed by the special structure of the Shift
Latin Router. To see that this is the correct number of constraints, notice we
select 2 entries from each of 2 rows: the total number is $K^2(K-1)^2/4$. Since
there are N routers we have a total of $O(NK^4)$. The heuristics we designed to
account for all of these intricacies were highly complicated and expensive to run.
Further, these constraints are highly restrictive on the space of solutions: of the
$k^4$ colorings on 4 variables, a Latin Router constraint on these 4 variables leaves
only $k(k-1)^2$ colorings remaining, which is incredibly restrictive compared to
the edge constraints on 2 variables, each of which leave $k(k-1)$ colorings.
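
The four-entry condition and the count of surviving colorings can be checked by brute force. The sketch below is only an illustration under stated assumptions: colors are labeled 0 to K-1 rather than 1 to K, the table with entry (i, j) equal to (i + j) mod K is taken as one example of a shift-structured Latin square, and the function names are hypothetical.

```python
from itertools import product


def shift_latin_square(K):
    """Example table (an assumption): entry (i, j) carries colour (i + j) mod K."""
    return [[(i + j) % K for j in range(K)] for i in range(K)]


def satisfies_shift_condition(table, K):
    """Check c[i][a] - c[i][b] + c[j][b] - c[j][a] == 0 (mod K) for all i, j, a, b."""
    return all(
        (table[i][a] - table[i][b] + table[j][b] - table[j][a]) % K == 0
        for i, j, a, b in product(range(K), repeat=4)
    )


def count_four_entry_colorings(k):
    """Count colourings of entries (i,a), (i,b), (j,a), (j,b) that keep both
    rows and both columns conflict-free and satisfy the shift condition."""
    count = 0
    for c_ia, c_ib, c_ja, c_jb in product(range(k), repeat=4):
        latin = c_ia != c_ib and c_ja != c_jb and c_ia != c_ja and c_ib != c_jb
        shift = (c_ia - c_ib + c_jb - c_ja) % k == 0
        if latin and shift:
            count += 1
    return count


if __name__ == "__main__":
    k = 5
    print(satisfies_shift_condition(shift_latin_square(k), k))
    print(count_four_entry_colorings(k), k * (k - 1) ** 2, k ** 4)
```

Once c_ia, c_ib and c_ja are chosen (k, k-1 and k-1 ways), the shift condition fixes c_jb, which is why the enumeration agrees with k(k-1)^2.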

Abstract. While CNF propositional satisfiability (SAT) is a sub-class of the
more  general  constraint  satisfaction  problem  (CSP),  conventional  wisdom  has  it 
that  some  well-known  CSP  look-back  techniques  -  including  backjumping  and 
learning  -  are  of  little  use  for  SAT.  We  enhance  the  Tableau  SAT  algorithm  of 
Crawford  and  Auton  with  look-back  techniques  and  evaluate  its  performance  on 
problems  specifically  designed  to  challenge  it. 

The  Random  3-SAT  problem  space  has  commonly  been  used  to  benchmark 
SAT  algorithms  because  consistently  difficult  instances  can  be  found  near  a 
region  known  as  the  phase  transition.  We  modify  Random  3-SAT  in  two  ways 
which make instances even harder. First, we evaluate problems with structural
regularities  and  find  that  CSP  look-back  techniques  offer  little  advantage. 
Second,  we  evaluate  problems  in  which  a  hard  unsatisfiable  instance  of  medium 
size  is  embedded  in  a  larger  instance,  and  we  find  the  look-back  enhancements 
to  be  indispensable.  Without  them,  most  instances  are  “exceptionally  hard”  — 
orders  of  magnitude  harder  than  typical  Random  3-SAT  instances  with  the  same 
surface  characteristics. 


1  Introduction 

Given the usual framework of backtrack search for systematic solution of the finite-domained
constraint satisfaction problem (CSP), techniques intended to improve efficiency
can be divided into two classes: look-ahead techniques, which exploit information
about the remaining search space, and look-back techniques, which exploit
information about search which has already taken place. The former class includes
variable ordering heuristics, value ordering heuristics, and dynamic consistency
enforcement schemes such as forward checking. The latter class includes schemes for
backjumping (also known as intelligent backtracking) and learning (also known as
nogood or constraint recording). In CSP algorithms, techniques from both classes are
popular; for instance, one common combination of techniques (e.g. [1, 12, 28]) is forward
checking, conflict-directed backjumping [23], and an ordering heuristic preferring
variables with the smallest domains.

CNF propositional satisfiability (SAT) is a specific kind of CSP in which every
variable ranges over the values {true, false}. For SAT, the most popular systematic
algorithms are variants of the Davis-Logemann-Loveland modification [8] to the procedure
originally defined by Davis and Putnam [7]; hereafter we refer to this procedure
as “DP”. In CSP terms, the procedure is equivalent to backtrack search with forward
checking and an ordering heuristic favoring unit-domained variables. Two effective
modern variants of this algorithm are Tableau [5] and POSIT [11], both amounting to
DP with highly optimized variable ordering heuristics. Are these SAT algorithms missing
anything by not incorporating conflict-directed backjumping or another look-back
technique? The standard Random 3-SAT problem space commonly used to benchmark
SAT algorithms may not be a good place to look for the answer: Tableau is able to
solve millions of instances from Random 3-SAT without any apparent trouble.

Here  we  challenge  Tableau  with  modifications  to  Random  3-SAT  to  make  instances 
more  difficult.  Random  3-SAT  is  already  a  source  of  consistently  hard  problem 
instances -- those in the region of the phase transition occurring when the ratio of constraints
to variables increases through a critical value [27]. The phase transition separates
an under-constrained region, where almost all instances are satisfiable and easy,
from an over-constrained region, where almost all instances are unsatisfiable and relatively
easy. We modify Random 3-SAT in two ways: first, we force problems to have
structural  regularities  intended  to  confuse  variable  selection  heuristics;  second,  we 
embed  hard  unsatisfiable  instances  into  larger  instances,  making  the  unsatisfiability  of 
the  resulting  instances  difficult  to  identify.  We  also  modify  Tableau  to  incorporate 
some  popular  look-back  techniques,  and  we  evaluate  the  enhanced  algorithm  with  the 
new  problem  spaces.  In  the  case  of  highly  regular  problems,  we  find  that  look-back 
techniques  offer  little  or  no  advantage;  for  solving  our  embedded  problems,  we  find 
them  indispensable. 

Researchers  working  with  random  spaces  for  other  CSPs  [1,  17],  with  other  SAT 
problem  spaces  [14,  15],  or  with  Random  3-SAT  but  an  algorithm  other  than  Tableau 
[14,  15],  have  found  rare  instances  in  the  under-constrained  region  so  difficult  as  to 
render  the  mean  difficulty  higher  there  than  in  the  transition  region.  Crawford  and 
Auton  [5]  using  Tableau  and  Random  3-SAT  find  no  such  “exceptionally  hard” 
instances  (EHIs).  Compared  to  a  reference  problem  space  harder  than  Random  3-SAT 
(“Variable  Regular  3-SAT”  -  see  Section  4),  our  embedding  procedure  generates 
instances  that  are  “exceptionally  hard”  for  Tableau  -  orders  of  magnitude  harder  than 
other  instances  with  the  same  surface  characteristics.  Our  EHIs  could  as  well  result 
from  Random  3-SAT,  albeit  with  low  probability.  These  instances  have  a  clause  to 
variable  ratio  that  places  them  in  the  under-constrained  region  of  the  reference  problem 
space.  Given  the  difficulty  and  consistency  with  which  they  are  generated,  we  believe 
they are useful as benchmarks for SAT algorithms.¹

We  find  that  look-back  techniques  greatly  reduce  the  incidence  of  EHIs  produced 
by  our  embedding  procedure.  The  result  is  similar  to  that  of  Baker  [1]  who,  using  a 
random  graph-coloring  CSP  space,  finds  no  EHIs  with  respect  to  a  conflict-directed 
backjumping  and  learning  algorithm.  In  contrast,  some  instances  produced  by  our 
embedding procedure remain difficult even for Tableau with conflict-directed backjumping
and learning enhancements. Related work has identified other 3-SAT
instances that are difficult for Tableau or other DP variants [16, 22]. The instances
among these which we have tested are trivial for Tableau enhanced with CSP look-back
techniques (see Section 6).

¹ Implementations of the problem generators and algorithms defined in this paper are available
through the Web page of the first author: http://www.cs.utexas.edu/users/bayardo/.

2  Definitions 

SAT involves determining whether a given Boolean expression has a satisfying
truth assignment. Any Boolean expression can be transformed to conjunctive normal
form (CNF), which allows a conjunction of clauses $C_1 \wedge C_2 \wedge \dots \wedge C_m$ where each
clause $C_j$ is a disjunction of literals $l_1 \vee l_2 \vee \dots \vee l_i$. A literal is either a variable
$x_v$ or its negation $\neg x_v$, $1 \le v \le n$. Expressions in conjunctive normal form are easily seen
to be instances of the CSP: each variable in the Boolean expression corresponds to a
variable in the CSP with a Boolean domain, and each clause of i literals is a constraint
disallowing exactly one truth assignment to the i variables mentioned. SAT restricted
to conjunctive normal form with exactly k literals per clause is known as k-SAT. A
common restriction of SAT that retains its NP-completeness is 3-SAT.

By  problem,  we  mean  an  abstract  description  such  as  the  definition  for  CSP,  SAT, 
or  3-SAT  which  can  be  instantiated  in  different  concrete  ways  —  e.g.,  by  enumerating 
specific variables and constraints. By instance, we mean one of these particular instantiations.
By problem space, we mean a parameterized set of problems, where each
parameter  represents  a  dimension  of  the  space.  Thus,  one  point  in  the  larger  space  of  3- 
SAT  is  3-SAT  with  exactly  75  variables  and  325  clauses.  In  a  random  problem  space, 
the  probability  distribution  for  the  occurrence  of  a  particular  instance  at  any  point 
depends  on  the  operation  of  a  non-deterministic  procedure  given  the  parameter  values 
for  that  point  as  inputs.  The  procedure  for  the  Random  3-SAT  problem  space  is  given 
below. 

Random 3-SAT: Inputs are the number of variables n and the number of
clauses m. Three distinct variables are randomly selected out of the
pool of n possible variables. Each variable is negated with probability
1/2. These literals are combined to form a clause. m clauses are
created in this manner and conjoined to form the 3-CNF Boolean
expression.
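
The Random 3-SAT procedure above can be sketched in a few lines. This is a minimal illustration rather than the authors' generator: the function name, the signed-integer encoding of literals (v for a variable, -v for its negation) and the optional `rng` argument are assumptions made here.

```python
import random


def random_3sat(n, m, rng=None):
    """Return m clauses over variables 1..n; each clause is a 3-tuple of
    signed integers on three distinct variables."""
    rng = rng or random.Random()
    clauses = []
    for _ in range(m):
        # Three distinct variables drawn uniformly from the pool of n.
        variables = rng.sample(range(1, n + 1), 3)
        # Each variable is negated with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses


if __name__ == "__main__":
    for clause in random_3sat(n=10, m=5, rng=random.Random(0)):
        print(clause)
```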

For scaling across different problem sizes (different values of n), we use the constraint
ratio m/n, which is expressed in units of clauses per variable. Instances with
high median difficulty can be found at the crossover point, occurring where half the
generated instances are satisfiable. The crossover point may be thought of as the midpoint
of the phase transition region. It turns out that the location of the crossover point
is fairly stable in constraint ratio terms: around 4.26 for Random 3-SAT [5].

3  Tableau  and  Look-Back  Enhancements 

We use Tableau [5] as our baseline SAT algorithm. Crawford and Auton show Tableau
to be very effective at solving instances from Random 3-SAT and at overcoming
the incidence of EHIs in the under-constrained region. For a full discussion of the heuristics
which lead to its success, please see [5]. We create three look-back-enhanced
versions of Tableau: one applying conflict-directed backjumping (CBJ) [23], another
CBJ with third-order learning [12], and the last CBJ with unrestricted learning (sometimes
referred to as “dependency-directed backtracking” [30]).

As look-back techniques, backjumping and learning are invoked when the algorithm
reaches a failure point where at least one variable assignment must be undone
before search can progress. Both exploit a set of “culprit” variables whose assignments
are determined to be responsible for the failure. The method used to identify the set of
culprits is critical to the effectiveness of the techniques. The culprit identification
scheme used by Prosser’s conflict-directed backjumping is widely used [1, 12, 28],
requires little overhead [28], and is provably more effective than some of its predecessors
[20]. Given a culprit identification scheme, the next issue to be decided is how to
exploit the culprits. Pure CBJ simply backs up to the most recent culprit to have been
assigned a value without recording the culprit assignments. At the other extreme is
unrestricted learning, which records every failed assignment of culprit variables (called a
nogood). A useful middle ground is to apply CBJ and to record nogoods only if they
are below a certain size. For instance, third-order learning [12] records only those
nogoods mentioning three or fewer variables. In the SAT context, this corresponds to
recording derived clauses of three or fewer literals.
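
The correspondence between nogoods and derived clauses can be made concrete with a small sketch. It assumes the culprit assignment has already been identified (the culprit identification of CBJ is not shown), encodes literals as signed integers, and uses hypothetical function names.

```python
def nogood_to_clause(nogood):
    """Turn a nogood (dict: variable -> assigned truth value) into the clause
    that forbids it. Literal v stands for the variable, -v for its negation."""
    # The derived clause negates every culprit assignment.
    return tuple(-var if value else var for var, value in sorted(nogood.items()))


def record_bounded(learned, nogood, max_size=3):
    """Record the derived clause only if the nogood mentions at most
    max_size variables (max_size=3 mimics third-order learning)."""
    if len(nogood) > max_size:
        return False
    learned.append(nogood_to_clause(nogood))
    return True


if __name__ == "__main__":
    learned = []
    # x1 = true together with x5 = false led to failure: learn (not x1 or x5).
    print(record_bounded(learned, {1: True, 5: False}), learned)
    # A four-variable nogood is too large for third-order learning.
    print(record_bounded(learned, {1: True, 2: True, 3: False, 4: True}), learned)
```

Passing a very large `max_size` in this sketch corresponds to unrestricted learning, which records every nogood regardless of size.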

In  the  experiments  that  follow,  we  concentrate  on  CBJ  and  bounded  learning 
enhancements  of  Tableau.  We  experiment  briefly  with  unrestricted  learning,  but  find  it 
too  expensive  on  the  more  difficult  instances. 

4  Regularity-Inducing  3-SAT  Generators 

Tableau  and  other  modern  SAT  algorithms  exploit  irregularities  within  the  search 
space  to  realize  inevitable  dead-ends  as  quickly  as  possible.  We  were  interested  in 
identifying  the  effects  of  highly  regular  instances  on  Tableau  and  its  enhancements. 
Various other studies [6, 13, 29, 31] have investigated the effects of increased regularity
on SAT and CSP solving, finding that higher regularity increases mean difficulty.
While  look-back  techniques  do  not  significantly  improve  Tableau’s  mean  performance 
on  the  instances  below,  we  discover  interesting  phase  transition  properties  which  we 
exploit  to  develop  a  harder  problem  generator  in  the  following  section. 

We first define two new generation procedures based on Random 3-SAT that progressively
eliminate certain sources of irregularity. An obvious potential irregularity
within a Random 3-SAT instance is that some variables may appear more often than
others. The following generation procedure removes this irregularity nearly completely:

Variable-Regular 3-SAT: Inputs are the number of variables (n) and the
number of clauses (m). The instance is constructed by putting
$\lfloor 3(m/n) \rfloor$ occurrences of each variable in a “bag”. A random set of
unique variables is then added to the bag so that there are exactly
3m variables in it. To construct each clause, three distinct variables
are removed from the bag. Each variable is negated with probability
1/2 to form a clause, and clauses are conjoined to form the 3-CNF
expression. If there are only one or two distinct variables remaining
within the bag, additional distinct variables are selected randomly
from the set of all variables.

Since  variables  are  negated  at  random  in  Variable-Regular  3-SAT,  a  given  variable 
may  appear  negated  more  often  than  not  (or  vice  versa).  The  next  generation  procedure 
removes  this  source  of  irregularity  nearly  completely.  This  problem  space  is  equivalent 
to  the  “doubly  balanced”  SAT  space  investigated  independently  by  Dubois  and 
Boufkhad  [10],  and  similar  to  those  defined  by  Genisson  and  Sais  [13]. 

Literal-Regular 3-SAT: Inputs are the number of variables (n) and the
number of clauses (m). There are 2n possible literals given n variables,
so $\lfloor 3m/2n \rfloor$ occurrences of each literal are placed in a bag.
A random set of unique literals is then added to the bag so that there
are exactly 3m literals in it. To construct each clause, three literals
on distinct variables are removed from the bag. If there are only 1 or
2 distinct variables mentioned in literals remaining in the bag, additional
distinct variables are randomly selected from the set of all
variables and negated with probability 1/2.
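
A minimal sketch of the Literal-Regular 3-SAT procedure follows. The text does not say how literals are drawn from the bag, so the sketch assumes each draw is uniform over the remaining occurrences whose variable is not already in the clause; the function name and the signed-integer encoding are likewise assumptions.

```python
import random
from collections import Counter


def literal_regular_3sat(n, m, rng=None):
    """Generate m clauses over variables 1..n with literal occurrences
    balanced as evenly as the bag construction allows (requires n >= 3)."""
    rng = rng or random.Random()
    literals = [sign * v for v in range(1, n + 1) for sign in (1, -1)]
    base = (3 * m) // (2 * n)
    bag = Counter({lit: base for lit in literals})
    # Top up with a random set of unique literals so the bag holds exactly 3m.
    for lit in rng.sample(literals, 3 * m - base * 2 * n):
        bag[lit] += 1
    clauses = []
    for _ in range(m):
        clause = []
        while len(clause) < 3:
            used = {abs(l) for l in clause}
            candidates = [l for l in literals if bag[l] > 0 and abs(l) not in used]
            if candidates:
                weights = [bag[l] for l in candidates]
                lit = rng.choices(candidates, weights=weights)[0]
                bag[lit] -= 1
            else:
                # Bag cannot supply another distinct variable: pick one at
                # random from all variables and negate with probability 1/2.
                v = rng.choice([u for u in range(1, n + 1) if u not in used])
                lit = v if rng.random() < 0.5 else -v
            clause.append(lit)
        clauses.append(tuple(clause))
    return clauses


if __name__ == "__main__":
    instance = literal_regular_3sat(n=10, m=30, rng=random.Random(3))
    print(Counter(l for c in instance for l in c).most_common(4))
```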

Data  on  the  location  of  the  phase  transition  and  mean  problem  difficulty  with 
respect  to  Tableau  for  instances  generated  by  the  regularity-inducing  generators  appear 
in  Figure  1.  Each  point  plotted  in  both  graphs  results  from  500  instances  solved  by  our 
implementation  of  Tableau. 

While both generators increase regularity, one exhibits a phase transition to the
right of Random 3-SAT’s, and the other to the left. The first graph in the figure displays
the phase transition properties for each procedure when n is fixed at 140. Smith and
Dyer [29] find a similar rightward shift with CSPs when decreasing constraint-graph
regularity. They point out that it is more difficult to assign a value to a highly constrained
variable than to a less constrained one; thus, greater variability in constraint
graph degree should lead on average to greater frequency of unsatisfiability for given
n and m. This helps to explain why variable-regularity shifts the phase transition to
the right from Random 3-SAT. Genisson and Sais [13] reported a similar leftward shift
with their literal-regular 3-SAT generator. We believe that literal-regular instances typically
require fewer clauses for unsatisfiability because the balance of positive and negative
variable occurrences provides more opportunities for resolution, leading to a
greater probability of ultimately deriving the empty clause.

Also displayed in Figure 1 is mean problem difficulty at various values of the input
parameters. Difficulty is represented by the number of branch points encountered by
Tableau. Branch points are defined as search tree nodes where Tableau does a significant
amount of work (i.e. more than just unit propagation) in branching on both possible
truth values [5]. As expected, mean difficulty at crossover increased with
regularity. Literal-Regular 3-SAT exhibits the highest mean difficulty; at this relatively
low value of n, its peak is almost an order of magnitude higher than that of Random
3-SAT.

Fig. 1. Phase transitions and search space explored at n=140. (Plot not reproduced; axis label: Clauses.)

We repeated the experiments at larger and smaller values of n in order to see how
the phase transition and mean difficulty changed, this time averaging over 1000
instances per data point. Recall that for Random 3-SAT the location of the crossover
point is fairly stable at a constraint ratio near 4.26. The graphs in Figure 2 illustrate the
crossover points for the other problem spaces, and the respective difficulty at the crossover
point. Crossover point locations for these new generators are also stable in constraint
ratio terms. We derived a crossover point constraint ratio of 4.41 for Variable-Regular
3-SAT and 3.54 for Literal-Regular 3-SAT.

Measuring difficulty in branch points explored by Tableau, it is known that the difficulty
of crossover point problems from Random 3-SAT approximately doubles every
time the number of variables is increased by 20 (at least up to n = 300) [5]. For Variable-Regular
3-SAT, we see that difficulty increases by a factor of 2.4 with an increment
of 20 variables (within the range explored). For Literal-Regular 3-SAT, the
difficulty increases by a factor of approximately 2.9.
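
To get a feel for what these growth factors imply, the short calculation below compounds each factor over a span of variables. It assumes purely exponential growth between the two sizes, which the text only supports within the range explored, and the function name and the chosen sizes are illustrative.

```python
def difficulty_ratio(factor_per_20_vars, n_from, n_to):
    """Ratio of crossover difficulty at n_to versus n_from, assuming difficulty
    is multiplied by factor_per_20_vars for every 20 additional variables."""
    return factor_per_20_vars ** ((n_to - n_from) / 20)


# Growth factors per 20 variables as reported in the text.
GROWTH = {
    "Random 3-SAT": 2.0,
    "Variable-Regular 3-SAT": 2.4,
    "Literal-Regular 3-SAT": 2.9,
}

if __name__ == "__main__":
    for name, factor in GROWTH.items():
        # Relative increase in branch points when going from n=100 to n=140.
        print(f"{name}: x{difficulty_ratio(factor, 100, 140):.2f}")
```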

We added conflict-directed backjumping and third-order learning to Tableau, anticipating
that learning (which derives new clauses during search) could create irregularities
for the variable ordering heuristics to exploit. We found that neither scheme
improved runtime nor reduced search space explored beyond a few percentage points.
Tableau, while performing worse on these instances than typical Random 3-SAT
instances, has a respectable growth rate when compared to naive DP variants. For
instance, Crawford and Auton [4] find that DP using a most-constrained-first variable
selection heuristic requires over 10,000 branch points for Random 3-SAT problems
with as few as 100 variables. After assigning a few variables, we suspect enough irregularities
appear in the resulting sub-problems to make Tableau’s look-ahead heuristics
effective. The irregularities created by learning were not significant in comparison.

Fig. 2. Crossover point locations and difficulty at the crossover point.


5  Manufacturing  and  Solving  Exceptionally  Hard  Problems 

Several researchers [1, 14, 15, 17] have found rare instances in the under-constrained
region of various problem spaces so difficult as to render the mean difficulty
higher than that of instances from the crossover point. Some of these instances have
been found to contain small unsatisfiable sub-problems [15]. In this section, instead of
randomly generating instances from the under-constrained region of a particular problem
space in search of exceptionally hard instances, we actively generate them by creating
under-constrained instances with small unsatisfiable sub-problems.

A simple approach for generating an under-constrained instance containing an
unsatisfiable sub-problem is to take the union of the clauses from an under-constrained
instance and an unsatisfiable one. However, most instances created using this approach
have surface characteristics that render them easy. For example, if the two combined
instances consist of disjoint variables, a simple connected components algorithm could
be used to identify each sub-problem so that they can be solved independently. Without such a
preprocessing phase, we have found that Tableau can perform poorly on such instances
if the variable occurring most frequently appears in the under-constrained instance:
Tableau will initially attempt to solve the under-constrained instance, and for each
solution, it attempts (and fails) to solve the unsatisfiable one. The addition of conflict-directed
backjumping completely remedies this behavior, since it allows each sub-problem
to be effectively solved independently in conjunction with the most-constrained-first
variable ordering heuristic. If the variable names of the combined
instances are overlapping, then variables shared between the instances almost always
occur most frequently (unless additional steps are taken to prevent this). This causes
Tableau to branch initially on those variables, allowing it to determine unsatisfiability
without exceptional difficulty.

In  this  section,  we  show  how  the  phase  transition  characteristics  of  the  regular 
problem  spaces  can  be  exploited  to  conveniently  generate  under-constrained  instances 
containing  well-concealed  small  unsatisfiable  sub-problems.  The  procedure  is  used  to 
generate  instances  whose  constraint  ratios  suggest  they  should  be  easily  satisfiable 
with reference to another problem space. Instead, they frequently turn out to be exceptionally
hard. We find that look-back enhancements provide a substantial degree of
insurance  against  poor  performance,  though  they  fail  to  eliminate  it  completely. 

In  principle,  an  instance  can  be  “exceptionally  hard”  only  with  reference  to  a  given 
algorithm  and  a  particular  problem  space.  For  example,  looking  at  Figure  2b,  crossover 
instances  from  Literal-Regular  3-SAT  would  be  considered  hard  with  reference  to 
Random  3-SAT.  As  a  convention,  we  take  an  EHI  to  be  any  instance  which  requires  at 
least  an  order  of  magnitude  more  work  than  the  mean  difficulty  of  crossover  instances 
in  the  reference  problem  space.  Here,  we  use  Variable-Regular  3-SAT  as  our  reference 
space  since  the  generator  produces  only  variable  regular  instances.  In  our  experiments 
with  Variable-Regular  3-SAT  and  those  with  Random  3-SAT,  we  found  no  instances  at 
crossover  requiring  more  than  5  times  the  mean  number  of  branch  points,  so  our  EHIs 
are  much  harder  than  any  of  the  observed  crossover  instances. 

The procedure below conceals a problem instance within a larger one. It uses the
Variable-Regular 3-SAT procedure to embed the input instance P' with n' variables
and m' clauses within a larger randomly generated instance of size n, m.

Embedding Procedure: Inputs are an instance P' of size n', m', and
parameters n, m specifying the size of the instance to be generated.
The number of occurrences of any variable in P' is required to be no
more than $\lfloor 3(m/n) \rfloor$. The variables of P' are first renamed randomly
to variables within the set of n variables to appear in the generated
instance. Next, a bag is filled with variable occurrences
exactly as is done by Variable-Regular 3-SAT with parameters n, m.
Then, for every occurrence of a variable appearing in the renamed
m' clauses, an occurrence of that variable is removed from the bag.
Afterwards, the remaining m - m' clauses are generated as defined
by the Variable-Regular instance generator.

If P' is unsatisfiable, then the resulting problem will also be unsatisfiable since it
contains the (renamed) clauses of P'. The Exceptionally Hard 3-SAT procedure below
uses this technique to create unsatisfiable variable-regular instances.
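
The embedding procedure can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: literals are signed integers, draws from the bag are uniform over the remaining occurrences of variables not yet in the clause (the text leaves the distribution unspecified), and the name `embed` is hypothetical.

```python
import random
from collections import Counter


def embed(core, n, m, rng=None):
    """Embed the clauses of `core` in an instance with n variables and m
    clauses, padding with clauses drawn as in Variable-Regular 3-SAT."""
    rng = rng or random.Random()
    limit = (3 * m) // n
    counts = Counter(abs(l) for c in core for l in c)
    core_vars = sorted(counts)
    if len(core_vars) > n or len(core) > m or max(counts.values(), default=0) > limit:
        raise ValueError("core does not fit the requested instance size")
    # Rename the core's variables randomly into 1..n.
    renaming = dict(zip(core_vars, rng.sample(range(1, n + 1), len(core_vars))))
    clauses = [tuple((1 if l > 0 else -1) * renaming[abs(l)] for l in c) for c in core]
    # Fill the bag exactly as Variable-Regular 3-SAT would for (n, m).
    bag = Counter({v: limit for v in range(1, n + 1)})
    for v in rng.sample(range(1, n + 1), 3 * m - limit * n):
        bag[v] += 1
    # Remove one occurrence per variable occurrence in the renamed core.
    for c in clauses:
        for l in c:
            bag[abs(l)] -= 1
    # Generate the remaining m - m' clauses from the bag.
    for _ in range(m - len(core)):
        chosen = []
        while len(chosen) < 3:
            candidates = [v for v in bag if bag[v] > 0 and v not in chosen]
            if candidates:
                weights = [bag[v] for v in candidates]
                v = rng.choices(candidates, weights=weights)[0]
                bag[v] -= 1
            else:
                # Fewer than three distinct variables left in the bag.
                v = rng.choice([u for u in range(1, n + 1) if u not in chosen])
            chosen.append(v)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return clauses


if __name__ == "__main__":
    hidden = [(1, 2, 3), (-1, -2, -3)]  # toy core, not an unsatisfiable one
    print(embed(hidden, n=12, m=36, rng=random.Random(1))[:4])
```

In this sketch, calling `embed([], n, m)` skips the renaming and removal steps and so behaves like a plain Variable-Regular 3-SAT generator.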

Exceptionally Hard 3-SAT: Inputs are four parameters, n, m, n', m'. We
begin by using the Literal-Regular 3-SAT procedure to generate an
instance P' of size n', m'. The procedure is invoked until an unsatisfiable
instance is produced. Then, clauses are greedily removed at
random from P' until there is no single clause that can be removed
without rendering the instance satisfiable. At this point, the reduced
P' is checked to ensure no variable appears more than $\lfloor 3(m/n) \rfloor$
times. If a variable appears too often, then we start over. Otherwise,
we embed P' into a size n, m instance using the previously
described embedding procedure.
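
The greedy reduction step and the occurrence check can be illustrated with the sketch below. It is not the authors' code: an exhaustive truth-table test stands in for a real SAT procedure such as Tableau, so it is only practical for very small instances, and all function names are assumptions.

```python
import random
from collections import Counter
from itertools import product


def is_satisfiable(clauses):
    """Exhaustive satisfiability test (a stand-in for a real SAT solver)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def greedy_reduce(clauses, rng=None):
    """Remove clauses in random order while the instance stays unsatisfiable;
    the result has no single clause whose removal keeps it unsatisfiable."""
    rng = rng or random.Random()
    if is_satisfiable(clauses):
        raise ValueError("greedy reduction expects an unsatisfiable instance")
    order = list(range(len(clauses)))
    rng.shuffle(order)
    keep = set(order)
    for i in order:
        trial = [clauses[j] for j in sorted(keep - {i})]
        if not is_satisfiable(trial):
            keep.discard(i)  # clause i is not needed for unsatisfiability
    return [clauses[j] for j in sorted(keep)]


def fits_occurrence_limit(core, n, m):
    """True if no variable of the reduced core occurs more than floor(3m/n) times."""
    counts = Counter(abs(l) for c in core for l in c)
    return max(counts.values(), default=0) <= (3 * m) // n


if __name__ == "__main__":
    toy = [(1,), (-1,), (2, 3), (1, 2)]
    core = greedy_reduce(toy, random.Random(0))
    print(core, fits_occurrence_limit(core, n=10, m=30))
```

One pass over the clauses suffices: a clause found necessary for the current set stays necessary for every subset of it, because any subset of a satisfiable set of clauses is itself satisfiable.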

The above procedure must find an unsatisfiable instance that can be effectively concealed
by way of low variable occurrences. Its strategy for maximizing the probability
of concealment is as follows. First, it selects an unsatisfiable instance from Literal-
Regular 3-SAT. On average, these require fewer clauses for unsatisfiability than
instances from Random 3-SAT and Variable-Regular 3-SAT due to the left-shifted
phase transition. The procedure then applies the greedy reduction phase to make the
instance even easier to hide (we find that reduction typically removes 20-40% of the
clauses from unsatisfiable crossover instances).¹ In the embedding phase, the reduced
instance is padded with clauses that suffice to make the result a possible output of
Variable-Regular 3-SAT. The right-shifted phase transition of Variable-Regular 3-SAT
allows padding with more clauses than Literal-Regular or Random 3-SAT for producing
what superficially appear to be under-constrained instances.

The following tables report the performance of Tableau and its enhancements
on instances from Exceptionally Hard 3-SAT. Tableau is denoted by “Tab”, Tableau
with conflict-directed backjumping “Tab+CBJ” and Tableau with conflict-directed
backjumping and third-order learning “Tab+CBJ+lrn”. We continue to report problem
difficulty in terms of branch points because the overhead of these additional enhancements
was small. The overhead of conflict-directed backjumping alone was negligible (less
than 3%) since Tableau expends most of its effort selecting branch variables. Learning
derives new clauses which had to be tested even by the branch-variable selection procedure.
Third-order learning, however, typically recorded only a few clauses, keeping
overhead well within 20% even on the hardest problems. At the end of this section, we
discuss preliminary experiments with unrestricted learning.

1 This phase is NP-hard, though it presents no practical problem as long as we choose to
embed small instances. For greater efficiency, we could embed an instance selected from a
set of pre-reduced instances instead of generating and reducing a new instance for embedding
with each invocation of the Exceptionally Hard 3-SAT procedure.

We used a constraint ratio of 3.5 or less for determining m for the various
values of n since it is well within the under-constrained region of Variable-Regular
3-SAT. For n', m' we used the formula m' = 3.5n' + 5 to produce hard Literal-Regular
3-SAT instances for embedding.
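
As a quick arithmetic check (a sketch, not part of the original experiments), reading the embedding formula as m' = 3.5n' + 5 reproduces the embedded sizes given in the captions of Tables 1 through 4, and the outer sizes stay at or below a constraint ratio of 3.5. The function name is ours.

```python
def embedded_clauses(n_embedded):
    """Clause count m' for an embedded instance with n' variables (m' = 3.5 n' + 5)."""
    return int(3.5 * n_embedded + 5)

# (n, m, n', m') as listed in the captions of Tables 1-4.
table_points = [(75, 225, 10, 40), (140, 490, 20, 75),
                (200, 700, 30, 110), (200, 700, 50, 180)]

for n, m, n_emb, m_emb in table_points:
    assert embedded_clauses(n_emb) == m_emb   # embedded size follows the formula
    assert m / n <= 3.5                       # outer instance is under-constrained
```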

To facilitate experiments with large numbers of instances (10,000 per value of
n), we had the algorithm halt and report failure beyond a threshold of branch points an
order of magnitude or more above the mean required for Variable-Regular crossover
instances with the same number of variables (10,000 branch points for n = 75 and
n = 140, 50,000 branch points for n = 200). We also report the mean difficulty of a
smaller number (200) of problems on which the failure mechanism was turned off.


Table 1. Exceptionally hard problem statistics for n=75, m=225, n'=10, m'=40

                    Mean Difficulty of    Failure Rate    Mean Difficulty of
                    Solved Instances      (%)             200 Instances
Algorithm           (branch points)       [> 10000]       (branch points)

Tableau             2,672                 66.5            87,993
Tab + CBJ           113                   0               113
Tab + CBJ + lrn     3                     0               3

Table 1 shows that unenhanced Tableau performs poorly even on very small
instances from Exceptionally Hard 3-SAT. For Variable-Regular crossover instances
with n = 75, Tableau requires a mere 20 branch points on average. Performance on
75 variable Exceptionally Hard 3-SAT instances is over 3 orders of magnitude worse.
While Tableau with backjumping is effective at solving these problems, the addition of
learning makes them trivial. For problems barely beyond n = 75, we found that Tableau’s
failure rate rapidly approached 100%.


Table 2. Exceptionally hard problem statistics for n=140, m=490, n'=20, m'=75

                    Mean Difficulty of    Failure Rate    Mean Difficulty of
                    Solved Instances      (%)             200 Instances
Algorithm           (branch points)       [> 10000]       (branch points)

Tab + CBJ           1,904                 57.2            55,122
Tab + CBJ + lrn     12                    0               12


The  next  data  point  (Table  2)  illustrates  that  Tableau  with  conflict-directed 
backjumping  alone  begins  to  go  awry  at  large  enough  problems.  Others  [15,  28]  have 
also  found  that  backjumping  alone  fails  to  eliminate  occurrence  of  exceptionally  hard 
instances,  though  in  the  context  of  sparse  CSP  or  much  larger  SAT  instances. 

Table 3. Exceptionally hard problem statistics for n=200, m=700, n'=30, m'=110

                    Mean Difficulty of Solved     Failure Rate (%)
Algorithm           Instances (branch points)     [> 50000]

Tab + CBJ + lrn     78                            0

Table  3  shows  that  the  learning  algorithm  remains  completely  effective  even 
at  larger  problem  sizes  when  the  embedded  instance  is  small.  The  failure  cutoff  is 
increased to 50,000 branch points for these larger problems since the average Variable-
Regular crossover instance of 200 variables requires approximately 5000 branch
points. It appears, however, that if the embedded and actual instances are large enough,
we  can  elicit  failure  even  in  the  learning  algorithm  (Table  4  below).  The  failure  rate 
remains  relatively  low  at  the  data  points  we  investigated.  Further  experimentation  is 
required  in  order  to  determine  if  the  failure  rate  approaches  100%  with  larger  problems 
(as  is  clearly  the  case  with  the  other  two  algorithms). 

Table 4. Exceptionally hard problem statistics for n=200, m=700, n'=50, m'=180

                    Mean Difficulty of Solved     Failure Rate (%)
Algorithm           Instances (branch points)     [> 50000]

Tab + CBJ + lrn     3,702                         3.4945

We have performed some experiments with unrestricted learning, but have
been unable to draw any solid conclusions about its effects other than that its overhead
becomes unacceptably high on sufficiently large and dense SAT instances. For example,
on difficult instances from Exceptionally Hard 3-SAT, the third-order learning
algorithm was 30 times faster in terms of branch points searched per second. On 50
instances from the <200, 700, 50, 180> point, the hardest instance found for the unrestricted
learning algorithm required 22,405 branch points. This translates to well over
10 times the CPU time that the third-order learning algorithm required to reach failure.
To account for this overhead, we feel the definition of an exceptionally hard instance
for unrestricted learning algorithms should take CPU time into account. By such a definition,
our implementation of unrestricted learning fails to eliminate them completely.

Baker [1] and Frost and Dechter [12] found that the overhead of their unrestricted
learning algorithms was not excessive. We believe our different findings
are primarily due to the constraint density of SAT compared to binary CSP. 3-SAT
instances require many constraints (clauses), since each excludes only a small fraction
of potential truth assignments. Further, each constraint is defined on three variables
instead of two. As a result, the set of variables responsible for each failure when solving
a SAT instance is often large. Another potential cause for our different findings is
that some instances produced by Exceptionally Hard 3-SAT required extensive search
even of the unrestricted learning algorithm. Baker’s instances were easy for his unrestricted
learning algorithm, so it never had the opportunity to derive an excessive number
of constraints. We believe that any SAT algorithm applying learning on instances
like ours will require either limited order learning as employed here, time or relevance
limits on derived clauses [2,16], or some method for efficiently producing smaller culprit
sets than conflict-directed backjumping (e.g. possibly along the lines of Dechter’s
deep learning schemes [9]).

6  Related  and  Future  Work 

Ginsberg & McAllester [16] evaluate a CSP algorithm they call “partial-order
dynamic backtracking” on a 3-SAT problem space with restricted structure. This problem
space creates an instance by arranging the variables on a two-dimensional grid and
creating clauses that contain variables forming a triangle with two sides of unit Euclidean
length. The algorithm incorporates look-ahead techniques not specifically geared
for SAT and look-back techniques similar to CBJ and learning. They find it immune to
pathologies encountered by Tableau on these instances. The hardest problems used in
their evaluation were crossover instances from their new SAT problem space with 625
variables. We found these instances trivial for Tableau with CBJ and third-order learning,
requiring on average 6 and at maximum 28 branch points on the 10,000 instances
we attempted. We found that CBJ-enhanced Tableau has occasional difficulties on
these same instances, requiring fewer than 22 branches on over half the instances, but
over 50,000 branches on 2.51% of them. This suggests these instances may have properties
similar to (though not as pronounced as) those from Exceptionally Hard 3-SAT.

Mazure  et  al.  [22]  recently  developed  a  look-ahead  technique  for  DP  based  on 
GSAT  [26].  The  technique  selects  branch  variables  by  counting  the  number  of  times  a 
variable  occurs  in  clauses  falsified  by  assignments  made  during  a  GSAT  search  on  the 
current  sub-problem.  The  intent  is  to  focus  search  on  variables  that  may  be  part  of  an 
inconsistent  kernel.  They  evaluate  their  technique  on  several  DIMACS  benchmark 
instances¹ which were infeasible for DP. Some of the “AIM” instances from this suite
that  are  difficult  for  DP  are  trivial  for  their  algorithm.  We  found  that  all  of  the  largest 
(200  variable)  “AIM”  instances  were  trivial  for  Tableau  with  CBJ  and  learning,  the 
most  difficult  requiring  27  branch  points.  Some  of  these  instances  were  difficult  for 
Tableau  without  learning  but  with  CBJ.  We  have  not  yet  explored  the  effect  of  their 
technique  on  our  problem  spaces. 

Lee  and  Plaisted  [21]  enhance  DP  with  a  backjumping  scheme  similar  to  (but  not 
as  powerful  as)  CBJ  which  they  use  as  a  subroutine  in  a  first-order  theorem  proving 
system. Gent and Walsh [15] experiment with an implementation of this algorithm² on
a  SAT  problem  space  they  call  “Constant  Probability”  and  find  that  the  backjumping 
scheme  reduces  the  incidence  of  EHIs  in  the  under-constrained  region,  but  fails  to 
eliminate  them  at  large  enough  problem  sizes.  Chvatal  [3]  also  applies  look-back  to 
SAT through “resolution search”, a DP variant with what appears to be a novel learning
scheme.


1 These instances are available through anonymous FTP at ftp:dimacs.rutgers.edu within
directory pub/challenge/sat/benchmarks/cnf.

2 This is the “intelligent backtracking” algorithm whose implementation they credit to Mark
Stickel.

Others have developed procedures intended to generate hard random problems.
Iwama et al. [19] and Rauzy [24] independently generate 3-SAT instances which are
known in advance to be satisfiable or unsatisfiable. The authors do not directly compare
the difficulty of these problems to problems from Random 3-SAT. Genisson and
Sais [13] investigate 3-SAT generators with controlled distributions of literals (much
like our Literal-Regular 3-SAT) and also find a left-shifted phase transition and
increased difficulty. Dubois and Boufkhad also investigate literal-regular instances
they call “doubly balanced” in a forthcoming paper [10]. Culberson et al. [6] and
Vlasie [31] describe generators for graph coloring CSPs in which graphs are endowed
with a variety of structural properties intended to make coloring difficult. Their empirical
results cannot be compared to ours directly, but they also find that increased regularity
increases mean problem difficulty. Whether look-back techniques are important
for these problem spaces remains to be evaluated empirically.

Crawford and Auton [5] were unable to find any EHIs for Tableau in the under-constrained
region of Random 3-SAT. Our results with Exceptionally Hard 3-SAT
show that they do exist, albeit in Random 3-SAT with low probability. Gent and Walsh
report that an improved DVO heuristic [14] and improved constraint propagation
method [15] (both look-ahead techniques) fail to eliminate EHIs for their 3-SAT generators
and DP implementation. Baker [1], working with graph coloring CSPs, finds no
problems to be extremely hard for a conflict-directed backjumping and unrestricted
learning algorithm. Selman and Kirkpatrick [25], using an earlier version of Tableau
[4] and Random 3-SAT, investigate the incidence of EHIs when a given instance is
subject to an equivalence-preserving random renaming of its variables. They report
observing the same incidence of EHIs whether running Tableau on 5000 different permutations
of the same 20 source instances, or whether sampling 5000 instances independently.
This suggests these EHIs arise on account of unfortunate variable orderings.
We have not yet investigated the effects of random renamings on our instances. The
fact that they occur with near certainty for unenhanced Tableau when generating sufficiently
large problems suggests that the source of their difficulty may be qualitatively
different.

The  fact  that  our  generated  EHIs  are  unsatisfiable  means  that  sound  but  incomplete 
algorithms  such  as  GSAT  [26]  cannot  be  used  to  address  them.  We  are  considering  a 
related  problem  generator  of  satisfiable  instances  for  the  purpose  of  benchmarking 
sound-and-incomplete  SAT  algorithms.  We  embed  hard  satisfiable  instances  instead  of 
unsatisfiable ones by removing one additional clause immediately following the reduction
phase of Exceptionally Hard 3-SAT. Limited experimentation has shown similar
(though  not  as  pronounced)  difficulties  result  for  unenhanced  Tableau,  even  though  the 
resulting  instances  are  almost  always  satisfiable. 

7  Conclusions 

We have shown that, contrary to the conventional wisdom, CSP look-back techniques
are useful for SAT. We were able to significantly reduce the incidence of exceptionally
hard instances (EHIs) encountered by the Tableau SAT algorithm after
enhancing it with look-back techniques. We devised a new generator which usually
succeeds in producing instances which are exceptionally hard for unenhanced Tableau
when compared to the difficulty of other common benchmark instances. Relatively few
of these instances remained exceptionally hard for the look-back-enhanced versions of
Tableau.

It may be that look-back techniques are essential to solving these instances efficiently,
though we encourage experimentation with non-look-back algorithms which
might refute this hypothesis. At the same time, we do not wish to minimize the significance
of look-ahead techniques. Tableau’s variable selection heuristics make it competitive
with the best current SAT algorithms, and we would expect to encounter many
more exceptionally hard instances without them. We do not believe that any generally
effective SAT algorithm can totally eradicate the incidence of EHIs. We do believe that
selected look-ahead and look-back techniques with modest overhead can provide some
valuable insurance against them.

Abstract. In the last twenty years, many algorithms and heuristics were developed
to find solutions in constraint networks. Their number increased to such an
extent  that  it  quickly  became  necessary  to  compare  their  performances  in  order  to 
propose  a  small  number  of  “good”  methods.  These  comparisons  often  led  us  to 
consider  FC  or  FC-CBJ  associated  with  a  “minimum  domain”  variable  ordering 
heuristic  as  the  best  techniques  to  solve  a  wide  variety  of  constraint  networks. 

In  this  paper,  we  first  try  to  convince  once  and  for  all  the  CSP  community  that 
MAC  is  not  only  more  efficient  than  FC  to  solve  large  practical  problems,  but  it  is 
also  really  more  efficient  than  FC  on  hard  and  large  random  problems.  Afterwards, 
we  introduce  an  original  and  efficient  way  to  combine  variable  ordering  heuristics. 
Finally,  we  conjecture  that  when  a  good  variable  ordering  heuristic  is  used,  CBJ 
becomes  an  expensive  gadget  which  almost  always  slows  down  the  search,  even 
if  it  saves  a  few  constraint  checks. 


1  Introduction 

Constraint satisfaction problems (CSPs) occur widely in artificial intelligence. They involve
finding values for problem variables subject to constraints on which combinations
are acceptable. For simplicity we restrict our attention here to binary CSPs, where the
constraints involve two variables.

Binary constraints are binary relations. If a variable i has a domain of potential values
Di and a variable j has a domain of potential values Dj, the constraint on i and j, Rij,
is a subset of the Cartesian product of Di and Dj. If the pair of values a for i and b for
j is acceptable to the constraint Rij between i and j, we will call the values consistent
(with respect to Rij). Asking whether a pair of values is consistent is called a constraint
check.

The entity involving the variables, the domains, and the constraints is called a constraint
network. Any constraint network can be associated with a constraint graph in which
the nodes are the variables of the network, and an edge links a pair of nodes if and only
if there is a constraint on the corresponding variables. Γ(i) represents the set of nodes
sharing an edge with the node i.
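
To make these definitions concrete, here is a minimal sketch of a binary constraint network that stores each relation Rij as a set of allowed value pairs and counts constraint checks. The class and method names are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class ConstraintNetwork:
    domains: dict                                   # variable -> set of values (Di)
    relations: dict = field(default_factory=dict)   # (i, j) -> set of allowed (a, b)
    checks: int = 0                                 # number of constraint checks so far

    def add_constraint(self, i, j, allowed_pairs):
        """Store Rij and its mirror Rji so lookups work in either direction."""
        pairs = set(allowed_pairs)
        self.relations[(i, j)] = pairs
        self.relations[(j, i)] = {(b, a) for a, b in pairs}

    def consistent(self, i, a, j, b):
        """One constraint check: is value a for i compatible with value b for j?"""
        self.checks += 1
        relation = self.relations.get((i, j))
        return relation is None or (a, b) in relation  # unconstrained pairs always agree

    def neighbors(self, i):
        """The set written Γ(i) in the text: variables sharing a constraint with i."""
        return {j for (k, j) in self.relations if k == i}
```

Counting every call to `consistent` as one check is a simplification chosen for this sketch; an implementation could equally skip the count when no constraint links the two variables.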

In the last twenty years, many algorithms and heuristics were developed to find solutions
in constraint networks [16], [21], [22]. Their number had increased to such an
extent that it quickly became necessary to compare their performances in order to designate
some of them as being the best methods. In recent years, many authors worked
in this way [23], [5], [10], [1]. The general inference drawn from these works is that
forward checking [16] (denoted by FC) or FC-CBJ (CBJ: conflict-directed backjumping
[22]) associated with a “minimum domain” variable ordering heuristic is the most
efficient strategy to solve CSPs. (It has been repeated so often that hard problems are often
considered as those that FC-CBJ cannot solve [24].) This can be considered a surprising
conclusion when we know that the constraint programming community uses full arc
consistency at each step of the search algorithms [32] and claims that it is the only practicable
way to solve large real world problems in reasonable time [26]. The contradiction
comes from the fact that in the CSP community, the sample problems used for the comparisons
were often very particular (especially small or easy [16], [21], [23]), and the
way the algorithms were compared was sometimes incomplete (no procedure maintaining
full arc consistency involved in the comparisons [5], [10]) or unsatisfactory [1]. But
this apparent contradiction did not give rise to questions or comments other than Sabin
and Freuder’s paper [28], in which it was pointed out that a procedure Maintaining Arc
Consistency during the search (MAC) could outperform FC on random problems around
the cross-over point.

In this paper, we try to convince the reader that MAC is not only more efficient than
FC to solve large practical problems, but it is also really more efficient than FC on hard
and large random problems. Afterwards, we introduce an original¹ way to really combine
different variable ordering heuristics (instead of just using a secondary heuristic to
break ties in the main one) and show its efficiency. Finally, we conjecture that when a
good variable ordering heuristic is used, CBJ becomes an expensive gadget which almost
always slows down the search, even if it saves a few constraint checks.

The paper is organized as follows. Section 2 contains an overview of the main previous
works on algorithms and heuristics to solve CSPs. Section 3 describes the instance
generator and the experimental method used in the rest of the paper. We show the good
behavior of MAC in Sect. 4. The new way to combine variable ordering heuristics is presented
in Sect. 5. Section 6 shows that CBJ loses its power when high level look-ahead
is performed during the search. Finally, Sect. 7 summarizes the work presented in this
paper.

2  Previous  Work 

It  has  been  noted  by  several  authors  (e.g.  [15])  that  there  are  four  choices  to  be  made 
when  searching  solutions  in  constraint  networks:  what  level  of  filtering  to  do,  which 
variable to instantiate next, what value to use as the instantiation, what kind of look-back
scheme to adopt.

In  fact,  a  wide  part  of  the  CSP  community  has  been  working  for  twenty  years  to 
answer  these  questions. 

To the question of the level of filtering to perform before instantiating a variable,
many papers concluded that forward-checking (FC) is the good compromise between
the pruning effect and the amount of overhead involved ([16], [20], [19], [1]).

1 This approach is original in the sense that it has never been published before. The only
presentation we know of such an approach is given in [2].

It has long been known that in constraint networks, the order in which
the variables are instantiated strongly affects the size of the search space explored by
backtracking algorithms. In 1980, Haralick and Elliot already presented the “fail first
principle” as a fundamental idea [16]. Following this, a variety of static variable ordering
heuristics (SVO) were proposed to order the variables such that the most constrained
variables are chosen first (thus respecting Haralick and Elliot’s principle). They calculate
once and for all an order, valid during the whole tree search, according to which variables
will be instantiated. They are usually based on the structure of the constraint graph.
The minimum width ordering (minw) is an order which minimizes the width of the constraint
graph [9]. The maximum degree heuristic (deg) orders the variables by decreasing
number of neighbors in the constraint graph [5]. The maximum cardinality ordering
(card) selects the first variable arbitrarily, then, at each stage, selects the variable that is
connected to the largest set of already selected variables [5]. The heuristic proposed by
Haralick and Elliot to illustrate their principle was a dynamic variable ordering heuristic
(DVO²). They proposed the minimum domain (dom) heuristic, which selects as the next
variable to be instantiated a variable that has a minimal number of remaining values in its
domain. It is a dynamic heuristic in the sense that the order in which variables are instantiated
can vary from branch to branch in the search tree. Papers discussing variable ordering
heuristics quickly found that DVO is generally better than SVO. More precisely,
dom has been considered as the best variable ordering heuristic ([27], [15], [5]).

The question of the choice of the value to use as an instantiation of the selected variable
did not attract as much attention from researchers as variable ordering. It has been explored
in [15] or [6], but without producing a simple generic method proven efficient
and usable in any constraint network. Even the promise selection criterion of Geelen
[14] did not attract FC users.

The question of the kind of look-back scheme to adopt had remained an open question
for a long time. Different approaches had been proposed, but none had been elected
as the best one (e.g. learning [4], backjumping [13], backmarking [12], etc.). This state
of things seems to have finished with the paper of Prosser [22], which presented conflict-directed
backjumping (CBJ). Indeed, Prosser showed in [23] that the hybrid algorithm
FC-CBJ is the most efficient algorithm (among many hybrid algorithms) to find solutions
in various instances of the zebra problem.

That is why, for a few years, FC-CBJ associated with the dom DVO (denoted by FC-CBJ-dom)
has been considered as the most efficient technique to solve CSPs (naturally
following the FC domination of the eighties). Moreover, the numerous papers studying
“really hard problems” ([24], [30], [7]) often take the implicit definition: “a hard problem
is a problem hard to solve with FC-CBJ-dom”.


Recent Work. Recently, some authors, not satisfied at all by the conclusion of the story
of search algorithms in CSPs, tried to improve this winner. This led to the paper of
Frost and Dechter [11], which reveals two important ways to improve on the classical
FC-CBJ-dom algorithm.

2 The origin of the name DVO is in [10] to denote what we will call here dom. We use DVO in
its general meaning, as it is proposed in [1].

First, the dom DVO is not as perfect as it seems. When several domains have the
same minimal size, the next variable to be instantiated is arbitrarily selected. When the
constraint graph is sparse, much useful information on its structure is lost by dom, which
does not deal with the constraint graph. In [11], a solution to these shortcomings is proposed
by using the dom+deg DVO: it consists of the dom DVO in which ties are broken³
by choosing the variable with the highest degree in the constraint graph. Frost and Dechter
underlined that “this scheme gives substantially better performance than picking one
of the tying variables at random”.
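
A minimal sketch of dom+deg selection, assuming the current domains and the constraint-graph degrees are available as dictionaries (the function and parameter names are ours, not from [11]):

```python
def select_dom_deg(unassigned, current_domains, degree):
    """Pick the variable with the smallest current domain (dom),
    breaking ties by the highest degree in the constraint graph (deg)."""
    return min(unassigned, key=lambda v: (len(current_domains[v]), -degree[v]))

# Hypothetical state: "a" and "b" tie on domain size, "b" has the higher degree.
domains = {"a": {1, 2}, "b": {1, 2}, "c": {1, 2, 3}}
degree = {"a": 1, "b": 3, "c": 5}
chosen = select_dom_deg(["a", "b", "c"], domains, degree)
```

Any tie that survives both criteria is resolved here by the order of `unassigned`, since `min` returns the first minimal element; that choice is an artifact of the sketch.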

Second, in FC-CBJ-dom, once a variable is selected for instantiation, values are picked
from the domain in an arbitrary fixed order (usually values are arbitrarily assigned a
sequence number and are selected according to this sequence). In [11], Frost and Dechter
presented various domain value ordering heuristics (LVO for look-ahead value ordering)
and experimentally showed that the min-conflicts⁴ (mc) LVO is the one which most improves
the efficiency of FC-CBJ-dom+deg (denoted by FC-CBJ-dom+deg-mc).
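
The min-conflicts ordering, as described in the accompanying footnote, can be sketched as follows. The `compatible` callback and the list-based domains are assumptions made for this illustration.

```python
def min_conflicts_order(var, domains, future_vars, compatible):
    """Order the values of `var` by increasing number of values in the domains
    of future variables with which each value is incompatible."""
    def conflicts(a):
        return sum(1 for j in future_vars for b in domains[j]
                   if not compatible(var, a, j, b))
    # sorted() is stable, so values with equal counts keep their original order.
    return sorted(domains[var], key=conflicts)

# Hypothetical example with a single constraint x > y.
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
order = min_conflicts_order("x", domains, ["y"], lambda i, a, j, b: a > b)
```

In this example the value 3 conflicts with one value of y, 2 with two, and 1 with all three, so 3 is tried first.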

Another,  quite  different  way  to  improve  search  by  reordering  values  (or  variables 
and  values)  after  a  dead-end  has  been  presented  in  [17].  Its  features  making  it  especially 
suitable  to  solve  real  world  problems,  we  do  not  discuss  it  here. 


3  A  Few  Words  About  the  Experiments 


Before  starting  the  experimental  comparisons  between  different  algorithms,  we  say  a 
few  words  about  the  experimental  method  we  chose. 

When we want to work on random problems, the first step is to choose an instance
generator. The characteristics of the generated problems will depend on the generator
used to create them. The CSP literature has presented several generators, always involving
four parameters: N the number of variables, D the common size of all the initial domains,
and two other parameters concerning the density of the constraint graph and the
tightness of the chosen constraints. Early generators often used a probability p1 that a
constraint exists between two variables, and a probability p2 that a value pair is forbidden
in a given constraint. The number of different networks that could be generated with
the same four parameters (N, D, p1, p2) was really huge. Networks with quite different
features (e.g. a network with a complete constraint graph and one with only one constraint)
could be generated with the same set of parameters. One of the consequences of
this fact was that a very large number of instances must be solved to predict the behavior
of an algorithm with a good statistical validity.

Hence, a new generation of instance generators appeared (beginning with [18]),
which replaced the probability p1 to have a constraint between two variables by a fixed
number C of constraints [24]. In the same way, p2 can be replaced by a number T of forbidden
value pairs [11]. In [30], p1 and p2 are still used, but they represent “proportions”

3 The idea of breaking ties in SVOs and DVOs had been previously proposed in [33].

4 min-conflicts considers each value in the domain of the selected variable and associates with it
the number of values in domains of future variables with which it is not compatible. The values
are then assigned to the selected variable in increasing order of this count. This is in fact the first
LVO presented in [14, page 32].
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and  not  probabilities  (i.e.  if  iV=20  andpi=0. 1,  the  number  of  generated  constraints  is  ex¬ 
actly  0.1  *  (20  *  19) /2  =  19).  This  new  method  generates  more  homogeneous  networks 
and  then,  it  is  not  necessary  to  solve  a  huge  number  of  networks  for  each  set  of  param¬ 
eters.  Nevertheless,  a  particular  care  must  be  taken  in  order  to  generate  networks  with  a 
uniform  distribution.  Specifically,  the  distribution  must  be  as  follows:  out  of  all  possi¬ 
ble  sets  of  C  variable  pairs  choose  any  particular  set  with  uniform  probability,  and  for 
each  constrained  pair  out  of  all  possible  sets  of  T  value  pairs  choose  any  particular  set 
with  uniform  probability^.  We  need  an  algorithm  that  generates  uniform  random  permu¬ 
tations  of  p  elements  selected  among  k  elements  without  repetition.  Essentially,  this  is 
just  choosing  which  of  the  k  elements  will  be  the  first,  which  of  the  remaining  k  —  1 
elements  will  be  the  second,  and  so  forth. 
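
The selection scheme just described can be sketched in Python as follows. This is only an illustration under stated assumptions, not the generator used for the experiments: the names draw_without_repetition and generate_network, the representation of a network as a dictionary, and the seed parameter are all hypothetical.

```python
import random
from itertools import combinations, product


def draw_without_repetition(items, p, rng):
    """Return p distinct elements of items; every ordered selection is equally likely."""
    pool = list(items)
    chosen = []
    for _ in range(p):
        # Choose one of the remaining elements uniformly, then remove it from the pool.
        index = rng.randrange(len(pool))
        pool[index], pool[-1] = pool[-1], pool[index]
        chosen.append(pool.pop())
    return chosen


def generate_network(n, d, c, t, seed=None):
    """Map each of c constrained variable pairs to a set of t forbidden value pairs."""
    rng = random.Random(seed)
    variable_pairs = list(combinations(range(n), 2))  # the n * (n - 1) / 2 possible pairs
    value_pairs = list(product(range(d), repeat=2))   # the d * d possible value pairs
    network = {}
    for pair in draw_without_repetition(variable_pairs, c, rng):
        network[pair] = set(draw_without_repetition(value_pairs, t, rng))
    return network


# With N = 20 and a proportion of 0.1, exactly 19 constraints are generated.
network = generate_network(n=20, d=10, c=19, t=30, seed=0)
print(len(network), "constraints,", len(next(iter(network.values()))), "forbidden pairs each")
```

Because every draw removes the chosen element from the pool, each set of C variable pairs (and each set of T value pairs) is obtained with the same probability.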

When we want to perform experiments on randomly generated networks, and once the instance
generator has been chosen, a second step is to select the sets of parameters that will be used to
illustrate the behavior of the algorithms tested. Each set of parameters (N, D, C, T) determines
the type of the networks generated: N variables each having a domain of size D, C constraints
out of the N * (N − 1)/2 possible, and T forbidden value pairs in each constraint among the
D * D possible. In this paper, we did not want to make a complete study of which sets of
parameters to use to illustrate our claims. Thus, we decided to use well-known sets of
parameters already presented in the literature. We chose the problems presented in [11] (some
of them were already used in [10]) and some of the most famous experiments used by Smith
and Grant ([30], [31], [29]). In certain experiments, we propose some variations in the
parameters (for example, increasing domain size to show the behavior of the algorithms on
networks with larger domains). But when we vary the density (C) or the domain size (D), we
want to keep the networks generated as close as possible to the cross-over point (the set of
parameters for which approximately 50% of the problems are satisfiable and 50% are not).
So, T is adjusted in order to stay at the value “Tco” which produces 50% satisfiable problems
and 50% unsatisfiable. When for given values of N, D and C no value of T (which is an
integer) produces exactly 50% satisfiable problems, we always take as Tco the smallest value
for which the proportion of unsatisfiable problems is greater than 50%. These variations of the
distance between Tco and the effective cross-over point explain the serrated look of some of
the curves reported below. The size of the problems tested in such cases is often rather small,
because each point of the curves given (see Fig. 2, 3, 5) requires solving a large number of
networks just to find the right value Tco.
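
The rule used to pick Tco can be written down as a short sketch. It assumes a hypothetical callable unsat_fraction(T) that generates and solves a sample of networks for the given T and returns the fraction found unsatisfiable; the toy values below are made up purely to exercise the function and are not experimental results.

```python
def find_tco(unsat_fraction, max_t):
    """Return the smallest T in 0..max_t whose unsatisfiable fraction exceeds 0.5.

    Returns None if no such T exists in the range.
    """
    for t in range(max_t + 1):
        if unsat_fraction(t) > 0.5:
            return t
    return None


# Made-up fractions for illustration only: no T gives exactly 50%,
# so the first T above 50% is taken.
toy_fractions = {0: 0.0, 1: 0.10, 2: 0.46, 3: 0.58, 4: 0.90}
print(find_tco(toy_fractions.get, max_t=4))
```

Each call to unsat_fraction hides the cost mentioned in the text: a whole sample of networks has to be solved for every candidate T.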

In the following sections we report different kinds of performance measures for the algorithms
tested. First, we often present what we call “number of constraint checks”. The classical
“number of constraint checks” measure is well-adapted for algorithms like FC, but presents
some problems when used with MAC, which maintains lists where some of the past constraint
checks are recorded. Hence, for MAC, what we name “number of constraint checks” is in fact
the number of classical constraint checks plus the number of list checks it performs during the
search. The second measure we use is cpu


^  Prosser’s generator [24] does not choose all the possible sets of C constraints with a uniform
probability. Frost and Dechter’s generator, while being better than Prosser’s, is not completely
uniform [8]. Although it is not extensively described, Smith’s generator seems to be
uniform [29] (while Smith and Grant’s is not [30]).

time, and the third one is the number of backtracks performed, i.e. the number of times
the algorithm goes backwards in the search tree.

In all the tables below, we generated and solved 100 instances for each of Frost and Dechter’s
sets of parameters tested. In all the figures (curves), we limited this number to 50 instances for
each tested value of the varying parameter.

We always report mean performance over the instances solved for a set of parameters. Indeed,
we think that reporting the median cost is questionable when the set of parameters is near the
cross-over point: unsatisfiable instances are generally harder to solve than satisfiable ones, so
the median will fall in a region that contains few problems, which makes this measure poorly
representative. In the extreme case, we can imagine a set of parameters for which 50 problems
are found satisfiable in 1 second and 50 are found unsatisfiable in 10 seconds: what is the
median cost of this experiment?
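
The extreme case above can be checked numerically. The sketch below uses the hypothetical times of the example (1 second for satisfiable instances, 10 seconds for unsatisfiable ones) and moves a single instance from one group to the other; the helper name summarize is ours.

```python
from statistics import mean, median


def summarize(n_sat, n_unsat, sat_time=1.0, unsat_time=10.0):
    """Return (mean, median) cpu time for a sample made of two clusters of instances."""
    times = [sat_time] * n_sat + [unsat_time] * n_unsat
    return mean(times), median(times)


for n_sat in (51, 50, 49):
    avg, med = summarize(n_sat, 100 - n_sat)
    print(f"{n_sat} satisfiable: mean = {avg:.2f} s, median = {med:.2f} s")
```

With 51, 50 and 49 satisfiable instances the median is 1 s, 5.5 s and 10 s respectively, whereas the mean only moves from 5.41 s to 5.59 s. In the 50/50 case the median is a value that no instance actually takes.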

LVOs being outside the scope of the present paper, we just checked that mc was a significant
improvement in our experiments compared to the versions of the algorithms written without
LVO. Hence, in the results presented in the next sections, mc has always been used, even if on
some instances the promise LVO of Geelen can behave slightly better than mc. However, after
a very rough comparison, we could not select a winner.

Finally, we want to point out that the programs used to perform the experiments of
this paper are available via the ftp site ftp.lirmm.fr.


4  MAC  is  Better  than  FC-CBJ 


We said in Sect. 2 that FC-CBJ is considered the best algorithm to find solutions in constraint
networks. In fact, the papers that have compared algorithms with different levels of filtering
during search, and that have concluded that FC performs the right amount of filtering, often
specify that this claim holds with respect to the tested problems [16], [20], [23]. The tested
problems were often the n-queens, very small random problems not necessarily chosen in the
phase transition, or the zebra problem. Therefore, we can conclude that on very easy or very
small problems FC is probably the algorithm which performs the right amount of filtering
(pure look-back algorithms have probably been definitively outperformed [23], [1]).

But Dechter and Meiri already said that “it is conceivable that on larger, more difficult
instances, intensive preprocessing algorithms may actually pay off” [5]. A first confirmation
appeared in the paper of Sabin and Freuder [28], in which they showed that MAC can
outperform FC on hard instances of CSPs. The good performance of MAC on large radio link
frequency assignment problems (where FC was thrashing) provides another confirmation [3].

Recently, Smith agreed that “exceptionally hard problems ought more properly to be called
problems which the particular search algorithm we are using finds exceptionally hard”. This led
her and Grant to study the behavior of MAC on problems found exceptionally hard with
FC-dom [31]. Their conclusion is that “in most cases, the MAC algorithm can show that the
problem is arc inconsistent, and so detects that it is insoluble without searching it”.

Finally, [1] is the only paper which clearly gives the advantage to FC-CBJ over algorithms
performing arc consistency at each node of the search tree after experiments on non-easy
problems. But, after discussion with Bacchus, it appears that in his paper the algorithm that
performs arc consistency at each stage of the search uses a kind of AC-0 algorithm, i.e. an AC-1
algorithm which does not take the structure of the constraint graph into account, checking all
the variable pairs as if the network were always a complete graph. So, we cannot take these
results into account.

We showed in Sect. 2 that the behavior of FC-CBJ can be improved by using a DVO proposed
by Frost and Dechter, dom+deg, and by using a good LVO such as mc. In this section, we will
show that, even associated with the dom+deg DVO and the mc LVO, FC-CBJ can no longer be
considered the best algorithm to solve CSPs. We will show experimentally that FC prunes too
weakly to be the most efficient on relatively hard problems. A search procedure such as MAC,
with a more intensive filtering mechanism, is more efficient at finding solutions on hard and
large problems, in which the overhead due to arc consistency is outweighed by its gain.

The experiments of this section are limited to the comparison of FC-CBJ-dom+deg-mc and
MAC-dom+deg-mc. FC-CBJ-dom+deg-mc is the algorithm stated to be the best in Sect. 2.
MAC-dom+deg-mc is here a classical MAC procedure [28] in which the arc consistency
algorithm used is AC-7 [3]. The DVO and the LVO used are the same in the two algorithms.


Table 1. FC-CBJ-dom+deg-mc and MAC-dom+deg-mc performances on problems generated
with Frost and Dechter’s sets of parameters [11]. “arc-inc” in the backtrack ratio column means
that all the problems generated for a given set of parameters were arc-inconsistent, implying an
infinite ratio (MAC detects arc-inconsistency without any backtrack).

#   Parameters           #constraint checks           cpu seconds              #backtracks
    N,D,C,T/D*D          FC-CBJ    MAC       ratio    FC-CBJ   MAC    ratio    ratio
#1  35,6,501,4/36        506,265   330,717   1.53     6.83     2.66   2.56     7.45
#2  35,9,178,27/81       248,414   156,131   1.59     3.26     1.00   3.25     14.29
#3  50,6,325,8/36        412,505   152,197   2.71     5.81     1.29   4.50     17.35
#4  50,20,95,300/400     565,330   273,537   2.07     7.11     1.62   4.39     37.02
#5  100,12,120,110/144   243,766   15,709    15.52    3.79     0.14   25.99    870.28
#6  125,3,929,1/9        271,557   44,862    6.05     4.51     1.52   2.96     12.08
#7  250,3,391,3/9        19,636    2,686     7.31     0.55     0.05   11.26    arc-inc
#8  350,3,524,3/9        820,368   3,558     230.53   31.04    0.07   476.31   arc-inc
#9  350,3,2292,1/9       426,713   51,176    8.34     9.40     4.35   2.16     9.68

A first set of experiments (in which parameters are taken from [11]) is given in Table 1. The
columns “ratio” represent how much better MAC-dom+deg-mc was than FC-CBJ-dom+deg-mc
with respect to the associated measure (mean number of constraint checks, mean cpu time,
mean number of backtracks). On this first set of experiments we can stress that the ratio of the
number of constraint checks is less advantageous for MAC than the cpu time ratio. An
explanation is that, for any search algorithm that performs some look-ahead filtering, each
backtrack point involves restoring the previous state and running the variable-value selection
again. Although it requires no constraint check, this process is time consuming.
MAC-dom+deg-mc, which needs far fewer backtracks than FC-CBJ-dom+deg-mc (see the last
column of Table 1), thus saves a lot of time in addition to the time saved by performing fewer
constraint checks. Overall, MAC-dom+deg-mc significantly outperforms FC-CBJ-dom+deg-mc
on these problems.

We performed a second set of experiments on the now classical (50, 10, 0.1, p2) set of
parameters of Smith and Grant [30], [31]. In our formalism, it corresponds to the set of
parameters (50, 10, 123, T). Figure 1 gives the results, which corroborate those obtained in
Table 1. MAC is slightly worse than FC-CBJ on easy problems (under- and over-constrained)
while being much better around the cross-over point.


Fig. 1. FC-CBJ-dom+deg-mc and MAC-dom+deg-mc time performances on the
(50, 10, 123, T) experiment of Smith and Grant [30].


Since Frost and Dechter’s and Smith and Grant’s parameters are limited to small domain sizes,
we took the (50, 20, 95, 300) set of parameters in Frost and Dechter’s sample, and changed
domain sizes while keeping N and C fixed at 50 and 95 respectively, T varying to stay at Tco
(see Fig. 2-(left)). We note that the more D grows, the more MAC-dom+deg-mc outperforms
FC-CBJ-dom+deg-mc, going from 3 times faster when D is smaller than 10 to 26 times faster
when D reaches 40.

Finally, we wanted to see the behavior of MAC when the density of the constraint graph
increases. Figure 2-(right) presents the FC-CBJ-dom+deg-mc to MAC-dom+deg-mc cpu time
ratio when the number C of constraints increases in the (30, 10, C, Tco) set of parameters. The
advantage of MAC increases until the constraint graph contains approximately a third of the
possible number of constraints. Afterwards, the gap between FC-CBJ and MAC narrows as the
number of constraints grows up to the complete graph®. This phenomenon was pointed out by
Sabin and Freuder.

®  These cpu time ratios, despite showing the advantage of MAC, do not go higher than 3. The
reason is that 30 variables is not enough to generate hard problems on which MAC would show
its real efficiency.

Fig. 2. FC-CBJ-dom+deg-mc to MAC-dom+deg-mc cpu time ratio on the (50, D, 95, Tco)
(left), D growing from 6 to 40; and on the (30, 10, C, Tco) (right), where C grows from 29 to 435
(complete graph).


5  Combined  DVOs:  dom/deg 


In Sect. 2 we presented different kinds of variable ordering heuristics and said that the dom
DVO had been considered for a long time as the best one. However, when the constraint graph
is sparse, much useful information is lost by this heuristic, while it is captured by the SVOs
based on the structure of the constraint graph.


Fig. 3. Different variable ordering heuristics tested with MAC on the (20, 10, C, Tco), where C
grows  from  40  to  190  (complete  graph).  Each  graph  represents  the  ratio  of  the  mean  number  of 
backtracks  of  MAC  with  the  given  heuristic  to  the  sum  of  the  mean  number  of  backtracks  of  the 
four  algorithms  tested  (absolute  results  would  have  given  unreadable  graphs  since  the  difficulty 
of  the  problems  significantly  grows  when  C  grows). 


In Fig. 3, where random problems with increasing density are solved by different versions of
MAC (i.e. using different variable ordering heuristics^), it is shown that dom can be a very poor
heuristic at low densities, while deg is very efficient on the same problems. Conversely, when
the constraint graph becomes dense, deg goes blind while dom becomes clever. dom+deg,
which breaks ties in dom by using the degree of the tying variables, is shown in Fig. 3 to
improve dom on problems where it was bad. But in dom+deg, the size of the domains clearly
has the main influence on the ordering, the degree of variables being used only in cases where
ties are found. To avoid this drawback, which prevents dom+deg from being as good as deg in
sparse constraint networks, we propose to really combine dom and deg to obtain a new DVO in
which deg is as influential as dom. This new DVO, dom/deg, selects as the next variable to be
instantiated a variable that has the smallest ratio of the size of the remaining domain to the
degree of the variable (i.e. a variable v minimizing $|D_v| / |\Gamma(v)|$). In Fig. 3 we have a
first idea of its behavior: it behaves like dom+deg in networks where dom was good, and like
deg in networks where deg was better. These first results being promising, we give in Table 2
and Fig. 4 a more complete set of experiments in which we compare MAC-dom+deg-mc and
MAC-dom/deg-mc. Once again, the characteristics of the problems tested are taken from [11]
and [30]. Results obtained in Table 2 show that with small domain sizes (D < 10) the two DVOs
have similar behaviors, with a small advantage for dom/deg. The difference is slightly
perceptible on the (35, 9, 178, 27) and the (100, 12, 120, 110) experiments. It is significant on
the (50, 20, 95, 300). This is explained by the fact that when D is very small, dom/deg and
dom+deg are quite similar criteria, the variations of $|D_v|$ (for a given variable v) dominating
those of $|\Gamma(v)|$ in dom/deg.
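
The difference between the two selection rules can be illustrated with a small Python sketch. It is not the code used in the experiments. The function names and the dictionary-based representation are hypothetical. Breaking dom+deg ties in favor of the largest degree, and ranking variables of degree zero last in dom/deg, are assumptions made for the illustration.

```python
def select_dom_plus_deg(domains, neighbors, unassigned):
    """dom+deg: smallest remaining domain; ties broken by largest degree (assumed)."""
    return min(unassigned, key=lambda v: (len(domains[v]), -len(neighbors[v])))


def select_dom_over_deg(domains, neighbors, unassigned):
    """dom/deg: smallest ratio |D_v| / |Gamma(v)|; degree-0 variables rank last (assumed)."""
    def ratio(v):
        degree = len(neighbors[v])
        return len(domains[v]) / degree if degree else float("inf")
    return min(unassigned, key=ratio)


# Toy network: y has a slightly larger domain but many more neighbors.
domains = {"x": {1, 2, 3}, "y": {1, 2, 3, 4}, "z": {1, 2, 3}}
neighbors = {"x": {"y"}, "y": {"x", "z", "u", "w"}, "z": {"y", "u"}}
unassigned = ["x", "y", "z"]

print(select_dom_plus_deg(domains, neighbors, unassigned))  # z: domain size 3, degree 2
print(select_dom_over_deg(domains, neighbors, unassigned))  # y: ratio 4/4 = 1
```

In this toy case dom+deg never looks at y because its domain is not among the smallest, whereas dom/deg selects it because its high degree outweighs the one extra value.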


Table 2. MAC-dom+deg-mc versus MAC-dom/deg-mc. Only ratios are given (real values
can be obtained from these ratios and Table 1). Values greater than 1 mean dom/deg is better,
values smaller than 1 mean dom+deg is better.

#   Parameters           ratios
    N,D,C,T/D*D          #constraint checks   time    #backtracks
#1  35,6,501,4/36        1.00                 1.01    1.35
#2  35,9,178,27/81       1.24                 1.23    1.63
#3  50,6,325,8/36        1.11                 1.12    1.53
#4  50,20,95,300/400     3.45                 3.05    7.01
#5  100,12,120,110/144   1.11                 1.10    3.20
#6  125,3,929,1/9        1.02                 0.98    1.42
#7  250,3,391,3/9        1.00                 1.00    arc-inc
#8  350,3,524,3/9        1.00                 1.00    arc-inc
#9  350,3,2292,1/9       1.00                 0.97    1.56

To be convinced that dom/deg is more advantageous when domains are larger, we tested the
two heuristics on instances of problems with increasing domain size. In Fig. 5-(left), the
domain sizes vary while N and C are fixed to 50 and 95 respectively. T changes so that
problems are always on the cross-over point. The more the size of the domains increases, the
more MAC-dom/deg-mc outperforms MAC-dom+deg-mc, going from equal speed to 7 times
faster when D grows from 6 to 40. Furthermore, Prosser (personal communication) has pointed
out that when initial domain sizes are not all equal, dom (or dom+deg) can be fooled by these
initial differences. We suppose that in these cases dom/deg would be even more interesting.

^  The LVO used is mc in all these versions. Without LVO, we observed that the differences
between good and bad algorithms increase.

Fig. 4. MAC-dom+deg-mc and MAC-dom/deg-mc on the (50, 10, 123, T).


Fig. 5. MAC-dom+deg-mc versus MAC-dom/deg-mc on the (50, D, 95, Tco) (left), and
MAC-CBJ-dom/deg-mc versus MAC-dom/deg-mc on the (50, D, 95, Tco) (right).


Thus, we can conclude that combining different DVOs is a promising approach. We have
tested other combined DVOs not presented in this paper. The one that can be named dom/card,
in which the number of previously assigned neighbors of the variable replaces the total number
of neighbors in the ratio, seems to be noticeably worse than dom/deg (although card alone was
considered a better SVO than deg alone [5]). On the other hand, when the ratio involves the
number of not yet assigned neighbors of the variable, the performances are roughly similar to
those obtained with dom/deg, sometimes better, sometimes worse. dom/deg has also been
implemented in FC-CBJ. We saw an improvement with respect to dom+deg, but smaller than
the one observed on MAC.

6  CBJ  Becomes  Useless 

We  have  shown  that  using  MAC  instead  of  FC  as  the  filtering  scheme  was  worthwhile  on 
hard  and  large  problems.  If  we  follow  the  evolution  of  FC  in  FC-CBJ  we  should  now  use 
MAC-CBJ  [25].  But,  let  us  recall  a  sentence  found  in  [16]:  “Look  ahead  to  the  future  in 
order  not  to  worry  about  the  past”.  In  fact,  some  authors  remarked  that  if  we  use  a  good 
variable  ordering  heuristic  “CBJ  is  unlikely  to  generate  large  backjumps,  and  its  savings 
are  likely  to  be  minimal”  because  “variables  that  have  conflicts  with  past  assignments 
are  likely  to  be  instantiated  sooner”  [1].  In  [30],  Smith  and  Grant  said  that  “for  most 
problems,  the  ordering  given  by  dom  ensures  that  chronological  backtracking  usually 
results  in  backtracking  to  the  real  culprit  for  a  failure,  so  that  informed  backtracking  does 
not  add  very  much”. 

These statements, made in the case of FC-dom, were probably too optimistic since a
non-negligible number of problems are easily solved by FC-CBJ-dom when FC-dom is
thrashing [31]. But, as suggested by Haralick and Elliott’s sentence, the more look-ahead we
perform, the less we have to worry about looking back. CBJ was a strong improvement on BT
(simple backtracking), and FC-CBJ can be an improvement on FC on hard problems, but
MAC-CBJ cannot simply be claimed to be an improvement on MAC. In [31], while a lot of
problems were found on which FC-CBJ-dom outperformed FC-dom by at least one order of
magnitude, only one instance was found on which MAC-CBJ-dom significantly outperformed
MAC-dom. If we now consider the DVO dom/deg in place of dom, there are even more reasons
to think that CBJ becomes useless (since dom/deg has been shown to be smarter than dom).
Furthermore, the more filtering a search procedure involves, the heavier the overhead caused
by CBJ [25]. CBJ was cheap to incorporate in BT, it was not prohibitive in FC, but it palpably
slows down the search in MAC. Hence, a significant number of constraint checks must be
saved to outweigh this overhead.


Table 3. MAC-CBJ-dom/deg-mc versus MAC-dom/deg-mc.

#   Parameters           ratios
    N,D,C,T/D*D          #constraint checks   time      #backtracks
#1  35,6,501,4/36        0.99                 1.17      0.99
#2  35,9,178,27/81       0.99                 1.32      0.99
#3  50,6,325,8/36        0.99                 1.21      0.99
#4  50,20,95,300/400     0.99                 1.33      0.99
#5  100,12,120,110/144   0.98                 0.99      0.96
#6  125,3,929,1/9        0.97                 1.08      0.96
#7  250,3,391,3/9        arc-inc              arc-inc   arc-inc
#8  350,3,524,3/9        arc-inc              arc-inc   arc-inc
#9  350,3,2292,1/9       0.64                 0.70      0.61

Table 3 gives the comparison of MAC-CBJ-dom/deg-mc and MAC-dom/deg-mc on Frost and
Dechter’s problems. On the problems #1 to #8 the result is easy to read: CBJ leads to a few
constraint check savings which are not sufficient to make up for the loss of time. But on the set
of parameters #9, there is a significant gain for MAC-CBJ-dom/deg-mc. If we focus on the 100
instances which form this experiment we see that on 99 instances MAC-dom/deg-mc and
MAC-CBJ-dom/deg-mc have almost the same behavior, solving the problem in less than
1 second with fewer than 1000 backtracks. But on one of the 100 instances MAC-dom/deg-mc
needs 137 seconds and 41,639 backtracks to find a solution, whereas MAC-CBJ-dom/deg-mc
only needs 73 seconds and 20,069 backtracks. The mean performances are strongly influenced
by this single instance, which seems to match the definition of “exceptionally hard problems”
(ehps) [30]. Indeed, it occurs in the region where almost all problems are soluble (2547
constraints are necessary to be at the cross-over point in the (350, 3, C, 1) set of
parameters [11]). But, as opposed to the ehps found in [31], where FC-CBJ or MAC-CBJ were
orders of magnitude faster than FC or MAC, MAC-CBJ-dom/deg-mc is only twice as fast as
MAC-dom/deg-mc on our ehp. Further experiments should probably be done to see whether
ehps could be found on which MAC-CBJ-dom/deg-mc is really better than MAC-dom/deg-mc,
though we did not find any in all the experiments we performed on smaller networks
(50 variables).

Finally, we want to recall that the more domain sizes increase, the more the length of the jumps
performed by CBJ decreases while CBJ time overhead increases (see the CBJ mechanism
in [22]). This is confirmed in Fig. 5-(right) where MAC-CBJ-dom/deg-mc and
MAC-dom/deg-mc are compared on the (50, D, 95, Tco) experiment with increasing D.

Therefore, except on sparse networks with small domain sizes, where more studies should be
done, we think we can conclude that including CBJ in MAC-dom/deg-mc is more likely to
slow down the search by at least 20% of cpu time than to speed it up.

7  Conclusion 

After recalling the history of search procedures in constraint networks, this paper has shown
how MAC can outperform FC and FC-CBJ on relatively hard and large randomly generated
instances of constraint networks. Having established the superiority of MAC, we have
proposed a new kind of variable ordering heuristic, dom/deg, which really combines
information on domain sizes and constraint graph structure. We demonstrated its efficiency
when compared with dom+deg, the most efficient previous heuristic. The total gain obtained
with these two techniques (MAC and dom/deg) is summarized in Table 4, which presents the
ratios of the mean performances of FC-CBJ-dom+deg-mc to the mean performances of
MAC-dom/deg-mc. The tested problems are again Frost and Dechter’s problems. The benefit is
always significant. Furthermore, we must keep in mind that the gain grows with the domain
size.

Therefore,  we  can  conclude  that  on  relatively  hard  and  large  instances  of  random 
problems,  MAC  and  our  new  variable  ordering  heuristic  are  more  efficient  than  FC-CBJ 
and  classical  dom  or  dom+deg  DVOs. 

Finally, we have shown in the last section that performing CBJ is almost always useless when
combined with a procedure achieving as much look-ahead as MAC-dom/deg-mc. The time
overhead is too heavy to be outweighed by the small number of constraint checks and
backtracks saved.

Table 4. FC-CBJ-dom+deg-mc versus MAC-dom/deg-mc.

#   Parameters           ratios
    N,D,C,T/D*D          #constraint checks   time     #backtracks
#1  35,6,501,4/36        1.54                 2.58     10.03
#2  35,9,178,27/81       1.97                 3.98     23.33
#3  50,6,325,8/36        3.00                 5.03     26.55
#4  50,20,95,300/400     7.13                 13.38    259.64
#5  100,12,120,110/144   17.29                28.69    2785.20
#6  125,3,929,1/9        6.15                 2.91     17.15
#7  250,3,391,3/9        7.31                 11.26    arc-inc
#8  350,3,524,3/9        230.53               476.31   arc-inc
#9  350,3,2292,1/9       8.35                 2.10     15.10


Abstract

We investigate a class of set constraints that is used for the type analysis of concurrent
constraint programs. Its constraints are inclusions between first-order terms (without set
operators) interpreted over non-empty sets of finite trees. We show that this class has the
independence property. We give a polynomial algorithm for entailment. The independence
property is a fundamental property of constraint systems. It says that the constraints cannot
express disjunctions, or, equivalently, that negated conjuncts are independent from each other.
As a consequence, the satisfiability of constraints with negated conjuncts can be directly
reduced to entailment.


1  Introduction 

In  this  paper  we  show  that  a  class  of  set  constraints  that  is  used  for  the  type  analysis  of 
concurrent  constraint  programs  has  the  independence  property  and  give  a  polynomial 
entailment  algorithm  for  this  class.  Below  we  introduce  the  property  and  the  class, 
and  then  we  are  able  to  state  the  results  more  precisely. 

Independence property. A constraint system has the independence property if the constraints
cannot express disjunctions. Formally, given the constraints $\gamma$ and
$\varphi_1, \ldots, \varphi_n$, the implication
$\gamma \rightarrow \varphi_1 \vee \ldots \vee \varphi_n$ is valid iff one of the $n$ implications
$\gamma \rightarrow \varphi_1, \ldots, \gamma \rightarrow \varphi_n$ is valid. An equivalent
formulation of the property states that the satisfiability problem of a conjunction with any
number of negated constraints can be reduced to “independent” sub-problems with exactly one
negated constraint. Namely, the conjunction
$\gamma \wedge \neg\varphi_1 \wedge \ldots \wedge \neg\varphi_n$ is satisfiable iff the $n$
conjunctions $\gamma \wedge \neg\varphi_1, \ldots, \gamma \wedge \neg\varphi_n$ are
satisfiable.^ As a direct algorithmic consequence, the satisfiability problem of the conjunction
of $\gamma$ with $n$ negated constraints can be reduced to $n$ entailment problems (i.e., the
dual of the validity of the $n$ implications
$\gamma \rightarrow \varphi_1, \ldots, \gamma \rightarrow \varphi_n$).

The  independence  property  is  a  fundamental  notion.  In  the  context  of  constraint 
logic  programming  (CLP),  it  has  made  possible  the  manipulation  of  inequations  [9]. 

•On leave from University of Wroclaw, Poland. Partially supported by KBN grant 8 S503 022 07
and by Foundation for Polish Science.

^Constraints are closed under conjunction; thus, we may use the constraint $\gamma$ for the
conjunction of all positive constraints.

The property also characterizes (and is characterized by) the semantics of bottom-up and
top-down computations [22]. A general study of the property shows its importance in various
symbolic computation areas [20]. In constraint databases, the property allows the efficient
containment test between constraint relations [17]. The property is necessary for the inference
of constrained functional dependencies in polynomial time [23].

Examples of constraint systems with the independence property are: universal
closures of the definite Horn clauses of a logic programming language [20], term
equations over finite or infinite trees [9], linear equations over the real numbers [19], various
constraint classes over feature trees [5, 4, 28], infinite Boolean algebras with positive
constraints [15], and various simple subclasses of constraints (e.g., inequations over
numbers where each variable always appears on the same side of the inequation) which
may be useful for the application considered in [23].

Set constraints. Generally, set constraints, i.e., inclusions between terms formed
by first-order function symbols and set operators and interpreted over sets of finite
trees (see [1, 3, 6, 7, 8, 10, 12, 13, 14, 18]), do not have the independence property.
For example, if $f$ is binary and $\emptyset$ denotes the empty set, then the equivalence

$f(x, y) \subseteq \emptyset \leftrightarrow (x \subseteq \emptyset) \vee (y \subseteq \emptyset)$

holds but neither of the implications $f(x, y) \subseteq \emptyset \to x \subseteq \emptyset$ and $f(x, y) \subseteq \emptyset \to y \subseteq \emptyset$
holds.^2 In fact, every inclusion between two terms with the same function symbol of
arity $n \geq 2$ expresses a disjunction, since the equivalence

$f(u_1, \ldots, u_n) \subseteq f(v_1, \ldots, v_n) \leftrightarrow (u_1 \subseteq \emptyset) \vee \ldots \vee (u_n \subseteq \emptyset) \vee (u_1 \subseteq v_1 \wedge \ldots \wedge u_n \subseteq v_n)$

holds (the inclusion $f(u_1, \ldots, u_n) \subseteq f(v_1, \ldots, v_n)$ entails none of the disjuncts if and
only if the arity $n$ is strictly greater than 1).
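The first equivalence can be checked on small finite sets. The following sketch assumes a hypothetical encoding in which a tree $f(s, t)$ is the Python tuple `("f", s, t)` and a set of trees is a `frozenset`; it is meant only to make the role of the empty set concrete.

```python
from itertools import product

EMPTY = frozenset()
A = frozenset({"a"})  # the singleton set containing the constant tree a


def apply_f(x: frozenset, y: frozenset) -> frozenset:
    """Interpretation of f(x, y): all trees f(s, t) with s in x and t in y."""
    return frozenset(("f", s, t) for s, t in product(x, y))


# f(x, y) is empty as soon as ONE argument is empty ...
left_empty = apply_f(EMPTY, A)
right_empty = apply_f(A, EMPTY)
# ... and non-empty when both arguments are non-empty.
both_non_empty = apply_f(A, A)
```

The valuation with `x = A` and `y = EMPTY` satisfies $f(x, y) \subseteq \emptyset$ but not $x \subseteq \emptyset$, and symmetrically for $y$, so neither implication holds on its own.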

The two counterexamples cited above can be avoided if we restrict the domain
of interpretation to only non-empty sets of trees. The question arises for which natural
syntactic subclasses of set constraints this semantic restriction is sufficient to
obtain the independence property.^3 We start investigating the question by considering
the possibly simplest class of set constraints. This class has been introduced as
the constraint system INES and investigated in [25].

^2 As Aiken et al. remark in [1], "this [equivalence] allows the system to encode nondeterministic
choices, which raises the complexity to NEXPTIME." In fact, we associate an algorithmic intuition
with the presence of the independence property. Namely, that it is the "deep" reason for the existence
of a "deterministic" test of satisfiability (for example the one in [25]).

^3 We observe that if union is one of the set operators, we can find counterexamples. Indeed, if
$t_i$ are ground terms, the inclusion $x \subseteq t_1 \cup \ldots \cup t_n$ entails the disjunction $t_1 \subseteq x \vee \ldots \vee t_n \subseteq x$
(but none of the disjuncts).

INES. We obtain the constraint system INES if we take as constraint formulas
conjunctions of inclusions $t_1 \subseteq t_2$ between first-order terms (without set operators)
and interpret them over the domain of non-empty sets of trees. INES is motivated
by, and used for, type analysis problems in concurrent constraint programming (CCP)
languages [21, 26], in particular Oz [27]. The non-emptiness is required for that
application. We borrow the explanation for this fact from [25]: In CCP languages,
execution proceeds by adding conjuncts to a global constraint store. This store thus
grows monotonically during the whole execution of a concurrent program (i.e., there
is no backtracking on this level). Therefore, the addition of an inconsistent conjunct
amounts to a programming error. The detection of potential errors of this kind at
compile time is possible by approximating the constraint store with an INES
constraint $\varphi$ and testing $\varphi$ for satisfiability. For example, the execution of the following
Oz program

proc {P Y} Y=a end
proc {Q Z} Z=b end
{P X}
{Q X}

leads to the inconsistent constraint store $X = Y \wedge Y = a \wedge X = Z \wedge Z = b$
and thus to an error. We infer from the program the set constraint $X \subseteq Y \wedge Y = a
\wedge X \subseteq Z \wedge Z = b$, which is unsatisfiable if and only if the interpretation domain is
restricted to non-empty sets. This part of the type analysis system is implemented
and used experimentally for Oz programs [24]. As explained in [25], INES has a
potential interest also for the analysis of CLP and functional programming languages.
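The role of non-emptiness in this example can be checked by brute force. The sketch below assumes a two-element universe of trees {a, b}, which suffices here because any solution must satisfy $X \subseteq \{a\} \cap \{b\}$; the function names are hypothetical and not part of the analysis system of [24].

```python
from itertools import combinations

UNIVERSE = ("a", "b")  # assumption: only the two constant trees matter here


def candidate_sets(allow_empty: bool):
    """Enumerate subsets of UNIVERSE, optionally excluding the empty set."""
    start = 0 if allow_empty else 1
    for size in range(start, len(UNIVERSE) + 1):
        for combo in combinations(UNIVERSE, size):
            yield frozenset(combo)


def store_satisfiable(allow_empty: bool) -> bool:
    """Test X <= Y, Y = a, X <= Z, Z = b, with Y and Z already substituted."""
    y = frozenset({"a"})
    z = frozenset({"b"})
    return any(x <= y and x <= z for x in candidate_sets(allow_empty))
```

With the empty set admitted, `X = {}` is a solution and the potential error would go undetected; over non-empty sets no solution exists.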

Result. The main result of this paper is that INES has the independence property.
In order to avoid pathological cases, we assume that the set of function symbols is
infinite, as is common in the context of independence results (see, e.g., [5, 4, 28]). The
practical reading of the assumption is that, when using the constraints for program
analysis, the analysis refers to all possible extensions of the program, with an
extensible set of identifiers. In a system with incremental compilation, and for modular
program analysis, the assumption seems natural.

We also derive a polynomial algorithm for testing entailment between INES
constraints. Thus, we obtain a polynomial test of the satisfiability of conjunctions of
positive and negated constraints.^4

The  only  algorithms  known  previously  for  the  two  problems  of  entailment  and 
satisfiability  with  negated  constraints  are  for  the  general  case  of  set  constraints  with 
set  operators  and  with  negation  [2,  11,  7];  their  complexity  is  NEXPTIME  [7]. 

Structure  of  the  paper.  In  the  next  section,  we  give  some  intuition  about  the 
proof  of  our  main  result.  We  then  fix  the  syntax  and  the  semantics  of  our  class  of  set 
constraints  formally  and  state  the  main  theorems.  In  Section  4  we  present  the  full 
proof.  We  end  with  a  conclusion  section. 


2  Overview  of  the  proof 

The proof of our independence result is interesting in its own right. To give some
intuition, an independence proof may generally work according to the following pattern
(as do the ones in [5, 4, 28]). First, one finds syntactic conditions for two constraints
$\gamma$ and $\varphi$ (in a normal form for the entailment problem) which characterize when the
entailment $\gamma \to \varphi$ holds. Then, given $\gamma$ and $\varphi_1, \ldots, \varphi_n$ such that none of the $n$
implications $\gamma \to \varphi_1$, ..., $\gamma \to \varphi_n$ holds, one adds constraints $\gamma_1, \ldots, \gamma_n$ to $\gamma$ such that
the $n$ implications $\gamma \wedge \gamma_i \to \neg\varphi_i$ hold, for $i = 1, \ldots, n$.

For example, given $\gamma \equiv x = f(u) \wedge f(v) = y$ and $\varphi_1 \equiv x = y$, we could set
$\gamma_1 \equiv u = a \wedge v = b$ (clearly, $\gamma \to \neg\varphi_1$ is not valid over the domain of finite trees, but
the implication $\gamma \wedge \gamma_1 \to \neg\varphi_1$ is valid).

^4 Such conjunctions are strictly more expressive than conjunctions of only positive constraints.
The proof of this fact is by easy modification of the proof of the corresponding statement for a
different class of set constraints (Corollary 3.2) in [2].

Then one uses the syntactic characterization of entailment in order to show that
the addition of the constraints $\gamma_1, \ldots, \gamma_n$ to $\gamma$ is consistent. Now we have that $\gamma \wedge
\gamma_1 \wedge \ldots \wedge \gamma_n$ is satisfiable and entails the conjunction $\neg\varphi_1 \wedge \ldots \wedge \neg\varphi_n$. This means that
the implication $\gamma \to \varphi_1 \vee \ldots \vee \varphi_n$ does not hold.

The previous description shows that, globally speaking, the art of proving
independence is the art of finding a "good" syntactic characterization of entailment
(i.e., one that exhibits which conjuncts one has to add to obtain the
disentailment of non-entailed constraints). This, however, is not evident for the case of
inclusion constraints, for the following intuitive reason. An implication of the form
$\gamma \wedge x \subseteq t \wedge s \subseteq y \to x \subseteq y$, for two given terms $s$ and $t$, holds if, but
generally not only if, $\gamma \to t \subseteq s$ holds. This is different for equations: The implication
$\gamma \wedge x = t \wedge s = y \to x = y$ holds if and only if $\gamma \to t = s$ holds. Thus, in order to make
the entailment of $\neg x = y$ (called the "disentailment" of $x = y$) hold, it is sufficient to
"make different" the two terms $s$ and $t$ (as done in the example above). This will not
work if $=$ is replaced by $\subseteq$, since forcing $t \not\subseteq s$ does not force $x \not\subseteq y$.

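The situation becomes even more gloomy when we consider implications like

$\gamma \wedge x \subseteq t_1 \wedge \ldots \wedge x \subseteq t_n \wedge s_1 \subseteq y \wedge \ldots \wedge s_m \subseteq y \to x \subseteq y$.

These are valid if (but, again, not only if) $\gamma$ entails the inclusion $t_1 \cap \ldots \cap t_n \subseteq
s_1 \cup \ldots \cup s_m$. Such an inclusion, however, is not an INES constraint.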

Even checking of inclusions between intersections and unions (as in the previous
example), however, may not suffice to derive the validity of an implication, as our
last example shows:

$x \subseteq f(y, \_) \wedge x \subseteq f(a, \_) \wedge z \subseteq a \to z \subseteq y$.

The first step of our proof is to construct an algorithm for testing entailment. This
algorithm in particular overcomes the above-mentioned limitations of the syntax with
respect to union and intersection. One ingredient of the algorithm is to combine the
upper bounds $t_1, \ldots, t_n$ of the variable $x$ by taking their "shuffles," i.e., the terms
$f(u_1, \ldots, u_k)$ where $f(u_1, \_, \ldots, \_)$ and ... and $f(\_, \ldots, \_, u_k)$ are among $x$'s upper
bounds. In the example: $x \subseteq f(u_1, u_2) \wedge x \subseteq f(v_1, v_2) \wedge f(u_1, v_2) \subseteq y$, this is already
sufficient to derive that $x \subseteq y$ is entailed. Another ingredient is the special treatment
of ground terms. We use information that is contained (explicitly or implicitly) in a
constraint to infer that a ground term is a lower bound of a variable (which is what
is needed in the last example).
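The "shuffle" construction can be sketched as a Cartesian product over argument positions. The encoding below (an upper bound as a pair of a function symbol and a tuple of variable names) and the name `shuffles` are assumptions made for this illustration only.

```python
from itertools import product
from typing import List, Set, Tuple

Bound = Tuple[str, Tuple[str, ...]]  # (function symbol, argument variables)


def shuffles(upper_bounds: List[Bound]) -> Set[Bound]:
    """All terms f(w_1, ..., w_k) where each w_j is the j-th argument of
    some upper bound; assumes all bounds share one symbol and one arity."""
    symbol = upper_bounds[0][0]
    # Column j collects the j-th argument of every upper bound.
    columns = zip(*(args for _, args in upper_bounds))
    return {(symbol, combo) for combo in product(*columns)}


bounds_of_x = [("f", ("u1", "u2")), ("f", ("v1", "v2"))]
shuffled = shuffles(bounds_of_x)
```

For the example in the text, `shuffled` contains $f(u_1, v_2)$, so $x \subseteq f(u_1, v_2)$ becomes available and $x \subseteq y$ follows by transitivity.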

Since the entailment algorithm does not yield a "good" syntactic characterization
of entailment, we abandon the technique of forcing disentailment by adding
corresponding conjuncts. Instead, we will use a certain "minimal" solution $\nu$ of a given
constraint $\gamma$ such that $\nu(x)$ is not a subset of $\nu(y)$ whenever the inclusion $x \subseteq y$ is
not entailed by $\gamma$. How do we get this solution $\nu$? The standard canonical solution
of $\gamma$ corresponding to its solved form [25] is not a candidate for $\nu$ since it is the
maximal solution of $\gamma$, assigning the set of all trees to every "unconstrained" variable $x$
(i.e., where $x \subseteq \tau$ does not occur for any term $\tau$). The minimal solution of an INES
constraint $\gamma$ does generally not exist. We will, however, "complete" the constraint $\gamma$
such that for each of its variables $x$ it contains a conjunct $t \subseteq x$ where $t$ is a ground
term. Then we can show that a minimal solution does exist (and is unique on the
variables that occur in $\gamma$). The remaining, technically somewhat involved problem is
to complete the constraint such that (1) the constraint remains consistent, and (2) for
all pairs of variables $x$ and $y$ where the inclusion $x \subseteq y$ is not entailed, there exists at
least one ground term $t$ below $x$ (i.e., $t \subseteq x$ occurs) such that $t \subseteq y$ is not entailed.
Then, the value of $y$ under the minimal solution will not contain the tree $t$. Hence
we obtain a solution of the completed constraint (and thus of the original one) which
satisfies $\neg\, x \subseteq y$.

3  Formal  statement  of  the  result 

We assume given an infinite signature $\Sigma$ fixing the arity $n \geq 0$ of its function
symbols $f, a, b, \ldots$ and an infinite set $V$ of variables $x, y, z, u, v, w, \ldots$. We use $t, s, \ldots$
as meta-variables for ground terms, $\tau, \tau_1, \tau_2, \ldots$ for (possibly non-ground) terms of
depth $\leq 1$ and $\theta, \theta_1, \theta_2$ for (possibly non-ground) terms of arbitrary depth. We write $\bar{u}$
for the tuple $(u_1, \ldots, u_n)$ of variables and $\bar{t}$ for the tuple $(t_1, \ldots, t_n)$ of ground terms,
where $n \geq 0$ is given implicitly (e.g., in $x \subseteq f(\bar{u})$ by the arity of the function
symbol $f$). We write $\bar{u} \subseteq \bar{v}$ for $\{u_1 \subseteq v_1, \ldots, u_n \subseteq v_n\}$. The notation $\Sigma(\gamma)$ stands for the
set of symbols occurring in $\gamma$, and $V(\gamma)$ for its variables. We use $\theta[u, \bar{z}]$ for a term
containing possibly occurrences of variables $u, z_1, \ldots, z_n$; in the context of $\theta[u, \bar{z}]$, the
term $\theta[v, \bar{z}]$ denotes the term $\theta$ with the occurrence of $u$ replaced by an occurrence of $v$
at the same position. We use $\theta[u/u']$ to denote the term $\theta$ with an occurrence of $u$
replaced by $u'$.

We are interested in inclusions between arbitrary first-order terms. However, to
simplify the presentation, we may assume (wlog. and without changing the complexity
measure) that our constraint formulas $\gamma$ are finite sets of inclusions of a restricted form,
namely either $\tau_1 \subseteq \tau_2$, i.e., between variables or flat terms, or $t \subseteq \tau$, i.e., between a
ground term on the left-hand side and a variable or a flat term on the right-hand side.
As is usual, we identify a conjunction of constraints with the set of all conjuncts.

Our interpretation domain is the set of all non-empty sets of finite trees (i.e.,
ground terms) over the signature $\Sigma$. A valuation is a mapping assigning non-empty
sets of finite trees to variables. Each valuation $\nu$ can be extended in a canonical way to
a mapping $\nu$ from terms to non-empty sets of finite trees by putting

$\nu(f(\tau_1, \ldots, \tau_n)) = \{f(t_1, \ldots, t_n) \mid t_i \in \nu(\tau_i) \text{ for } i = 1, \ldots, n\}$.

A valuation $\nu$ satisfies $\gamma$ if for all constraints $\tau_1 \subseteq \tau_2$ in $\gamma$ we have $\nu(\tau_1) \subseteq \nu(\tau_2)$.
We say that $\gamma$ is satisfiable if there exists a valuation satisfying $\gamma$, and that $\gamma$ entails $\varphi$,
written $\gamma \models \varphi$, if all valuations satisfying $\gamma$ satisfy $\varphi$.
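The canonical extension of a valuation and the satisfaction test can be sketched for finite sets of trees. The encoding is an assumption of this sketch: a variable is a string, a term or tree $f(\ldots)$ is a tuple `("f", ...)`, a constant $a$ is the one-element tuple `("a",)`, and a valuation is a dictionary from variables to non-empty frozensets of trees.

```python
from itertools import product


def evaluate(term, valuation):
    """Extend a valuation from variables to terms, position by position."""
    if isinstance(term, str):          # a variable
        return valuation[term]
    symbol, *args = term               # a term f(tau_1, ..., tau_n), n >= 0
    arg_sets = [evaluate(arg, valuation) for arg in args]
    # For n = 0, product() yields one empty tuple, giving the singleton {f}.
    return frozenset((symbol, *combo) for combo in product(*arg_sets))


def satisfies(valuation, constraints):
    """True iff every inclusion (lhs, rhs) holds under the valuation."""
    return all(
        evaluate(lhs, valuation) <= evaluate(rhs, valuation)
        for lhs, rhs in constraints
    )


a, b = ("a",), ("b",)
nu = {"x": frozenset({a}), "y": frozenset({a, b})}
```

Under `nu`, the term $f(x, y)$ evaluates to $\{f(a, a), f(a, b)\}$, and the constraint $x \subseteq y$ is satisfied while $y \subseteq x$ is not.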

The following remark says that we may restrict ourselves to satisfiability problems
where all negated constraints are of the form $x \not\subseteq y$.

Remark 1 Let $\gamma$ be a constraint, and $x_1, x_2$ fresh variables, i.e., $x_1, x_2 \notin V(\gamma)$. Then
$\gamma \cup \{\theta_1 \not\subseteq \theta_2\}$ is satisfiable iff $\gamma \cup \{x_1 \subseteq \theta_1, \theta_2 \subseteq x_2, x_1 \not\subseteq x_2\}$ is satisfiable. □

We thus may state our main result as follows.

Theorem 1 (Independence) Let $\gamma$ be a constraint. Then $\gamma \cup \bigcup_i \{x_i \not\subseteq y_i\}$ is
satisfiable iff $\gamma \cup \{x_i \not\subseteq y_i\}$ is satisfiable for all $i$.

Note our assumption about the infinite signature. In the case of a finite one,
the independence property does generally not hold. For a counterexample,
consider $\Sigma = \{a, f\}$ with $a$ being a constant and $f$ being unary, and the constraint $\gamma =
\{f(a) \subseteq y, f(y) \subseteq y\}$. Then $\gamma$ implies $a \subseteq x \vee x \subseteq y$, but it implies neither $a \subseteq x$
nor $x \subseteq y$.

We will now refer to the axioms in Table 1. Axiom 1 needs to come in two
versions since a ground term may be the lower bound of a term. As for the restriction
in the formulation of Axiom 2, note that without it, the closure of, for example, the
constraint $a \subseteq x \wedge f(x) \subseteq x$ under the consequences of the axiom would be an infinite
set of constraints.

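Axiom 3 could be stated in the more formal (but less readable) way:

$x \subseteq f(u_{11}, \ldots, u_{1n}) \wedge \ldots \wedge x \subseteq f(u_{m1}, \ldots, u_{mn}) \to x \subseteq f(u_{j_1 1}, \ldots, u_{j_n n})$

where $j_1, \ldots, j_n \in \{1, \ldots, m\}$. Similarly, the first part of Axiom 4 can be stated as
$\gamma \to \tau \subseteq \tau$ if $\tau$ occurs in $\gamma$, and $\tau_1 \subseteq \tau_2 \wedge \tau_2 \subseteq \tau_3 \to \tau_1 \subseteq \tau_3$, and $t \subseteq \tau_1 \wedge \tau_1 \subseteq \tau_2 \to
t \subseteq \tau_2$. The other axioms use a notation that is introduced in the following definition.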

Definition 2 (implicit occurrence: "$\tilde{\in}$") Given the ground term $t$, we say
that $x \subseteq t$ occurs implicitly in $\gamma$, and write $x \subseteq t \mathrel{\tilde{\in}} \gamma$, if the following,
inductively defined condition, holds:

• either $t$ is a constant symbol and $x \subseteq t \in \gamma$, or

• $t = f(t_1, \ldots, t_n)$ and $x \subseteq f(x_1, \ldots, x_n) \in \gamma$ and $x_i \subseteq t_i \mathrel{\tilde{\in}} \gamma$ for all $i$.

We say that a term $t$ occurs implicitly in $\gamma$ if $x \subseteq t$ occurs implicitly in $\gamma$ for some
variable $x$.
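The inductive definition translates directly into a recursive test. The sketch below assumes a hypothetical encoding: a constraint is a pair `(lhs, rhs)`, variables are strings, and a term $f(\ldots)$ is a tuple `("f", ...)`, so that a constant $a$ is `("a",)`.

```python
def occurs_implicitly(x, t, gamma):
    """Does the inclusion x <= t (t a ground term) occur implicitly in gamma?"""
    symbol, *subterms = t
    if not subterms:
        # Base case: t is a constant, so x <= t must occur explicitly.
        return (x, t) in gamma
    for lhs, rhs in gamma:
        # Look for an explicit upper bound x <= f(x_1, ..., x_n) ...
        if lhs == x and isinstance(rhs, tuple) and rhs[0] == symbol and len(rhs) == len(t):
            # ... whose argument variables are implicitly bounded by the t_i.
            if all(
                isinstance(arg, str) and occurs_implicitly(arg, sub, gamma)
                for arg, sub in zip(rhs[1:], subterms)
            ):
                return True
    return False


a, b = ("a",), ("b",)
gamma = {("x", ("f", "u", "v")), ("u", a), ("v", b)}
```

Here $x \subseteq f(a, b)$ occurs implicitly in `gamma` although the ground term $f(a, b)$ is never written down explicitly. The recursion terminates because each call descends into a proper subterm of $t$.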

Remark 3 The axioms in Table 1 are valid over non-empty sets of finite trees.

Proof. The proof is done by inspection of each axiom. For Axiom 5, note that if $\nu$
satisfies $x \subseteq t$ then $\nu(x) = \{t\}$. For the last axiom, the constraint $x \subseteq f(\ldots, u, \ldots)$
and the implicit occurrence of $u \subseteq t$ in $\gamma$ imply that if $\nu$ is a solution of $\gamma$ then all
trees in $\nu(x)$ must have the subtree $t$ on the respective position; the constraint $x \subseteq
f(\ldots, v, \ldots)$ implies then that $t \in \nu(v)$. □

Definition  4  (closed)  We  say  that  a  constraint  7  is  closed  if  it  contains  all  its 
consequences  according  to  the  axioms  in  Table  1. 

As  a  direct  consequence  of  Remark  3  (formulated  as  a  corollary  below),  we  may 
restrict  our  logical  investigation  of  the  entailment  problem  to  closed  constraints.  Also, 
in  the  following  we  will  concentrate  on  constraints  that  are  satisfiable.  Satisfiability 
can  be  tested  in  cubic  time  [25]. 

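Remark 5 Let $\gamma'$ be the closure of the constraint $\gamma$ under the consequences of the
axioms in Table 1. Then $\gamma \models x \subseteq y$ iff $\gamma' \models x \subseteq y$. □

Theorem 2 (Entailment) If $\gamma$ is a satisfiable and closed constraint, then

$\gamma \models x \subseteq y$ iff $x \subseteq y \in \gamma$.

1.  $f(\bar{u}) \subseteq f(\bar{v}) \to \bar{u} \subseteq \bar{v}$
    $f(\bar{t}) \subseteq f(\bar{u}) \to \bar{t} \subseteq \bar{u}$

2.  $\bar{u} \subseteq \bar{v} \to f(\bar{u}) \subseteq f(\bar{v})$
    $\gamma \wedge \bar{t} \subseteq \bar{u} \to f(\bar{t}) \subseteq f(\bar{u})$   if $f(\bar{t})$ occurs in $\gamma$

3.  $x \subseteq f(u_1, \ldots) \wedge \ldots \wedge x \subseteq f(\ldots, u_n) \to x \subseteq f(u_1, \ldots, u_n)$

4.  the relation $\subseteq$ is reflexive and transitive
    $\gamma \wedge t \subseteq y \to x \subseteq y$   if $x \subseteq t \mathrel{\tilde{\in}} \gamma$

5.  $\gamma \to t \subseteq x$   if $x \subseteq t \mathrel{\tilde{\in}} \gamma$

6.  $\gamma \wedge x \subseteq f(\ldots, u, \ldots) \wedge x \subseteq f(\ldots, v, \ldots) \to t \subseteq v$   if $u \subseteq t \mathrel{\tilde{\in}} \gamma$


Table 1: Axioms for inclusion constraints over non-empty sets of finite trees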

Entailment  algorithm.  We  define  an  algorithm  by  fixed  point  iteration.  At 
each  iteration  step,  we  add  the  consequences  under  the  axioms  in  Table  1  of  the  set 
of  constraints  derived  so  far.  After  termination,  the  obtained  set  of  constraints  is 
equivalent  to  the  initial  one  and  it  is  closed.  Hence,  by  Remark  5  and  Theorem  2, 
entailment  is  tested  by  checking  membership  (for  inclusions  between  variables,  which, 
according  to  Remark  1,  is  sufficient). 
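The fixed point iteration can be sketched as follows. This is deliberately partial: it applies only the decomposition of Axiom 1 and the transitivity part of Axiom 4, so it is sound but not the complete algorithm described above. The encoding (constraints as pairs, variables as strings, terms as tuples) and the function names are assumptions of this sketch.

```python
def close(constraints):
    """Add consequences until nothing new is derived (a fixed point)."""
    closed = set(constraints)
    while True:
        new = set()
        for lhs, rhs in closed:
            # Axiom 1: f(u_1..u_n) <= f(v_1..v_n) yields u_i <= v_i
            # (valid only over NON-EMPTY sets of trees).
            if (isinstance(lhs, tuple) and isinstance(rhs, tuple)
                    and lhs[0] == rhs[0] and len(lhs) == len(rhs)):
                new.update(zip(lhs[1:], rhs[1:]))
            # Axiom 4 (transitivity): lhs <= rhs and rhs <= r2 yield lhs <= r2.
            for l2, r2 in closed:
                if rhs == l2:
                    new.add((lhs, r2))
        if new <= closed:      # no new consequence: fixed point reached
            return closed
        closed |= new


def entails_var_inclusion(constraints, x, y):
    """Entailment of x <= y tested by membership in the closure."""
    return x == y or (x, y) in close(constraints)


gamma = {("x", ("f", "u")), (("f", "u"), ("f", "v")), (("f", "v"), "y")}
```

The loop terminates because every derived inclusion is built from terms and subterms that already occur in the input, of which there are finitely many. The remaining axioms would be added as further cases inside the same loop.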

How many applications of each rule are possible before a fixed point is reached,
if $n$ is the size of $\gamma$? Axiom 5 can be applied only on pairs $(x, t)$ such that $\gamma$ entails
that $x \subseteq t$. Since $\gamma$ is assumed to be consistent, there are at most $n$ such pairs.
Similarly, the number of applications of Axiom 6 is bounded by the number of variables
$y$ times the number of triples where $j$ is the argument position containing,
say, the variable $u_j$, where $u_j \subseteq t$ occurs implicitly in $\gamma$ for some ground term $t$. The
number of such triples is bounded by $k \cdot n$, where $k$ is the (assumed a priori fixed)
maximal arity of all function symbols $f$ occurring in $\gamma$. This is because the conjunction of
$x \subseteq f(\ldots, t, \ldots)$ and $x \subseteq f(\ldots, t', \ldots)$ is unsatisfiable (over the domain of nonempty
sets of trees!) for $t \neq t'$. Thus, the number of consequences under Axioms 5 and 6 is
bounded by $n + k \cdot n^2$, and at most $n + k \cdot n$ new explicitly occurring ground terms $t$
can be introduced. There are at most $O(n^2)$ many consequences of Axiom 1, since the
consequences are inclusions between variables or between ground terms and variables,
and we do not introduce new variables, and we introduce at most $n + k \cdot n$ new ground
terms. There are at most $O(n^{2(k+1)})$ consequences of Axiom 2, Axiom 3 and Axiom 4,
since the consequences are inclusions between variables, explicitly occurring ground
terms and terms $f(\bar{u})$. There are at most $n^{k+1}$ terms of the form $f(\bar{u})$ (since the
number of function symbols occurring in $\gamma$ is bounded by $n$).

Since  the  work  at  each  iteration  step  can  clearly  be  done  in  polynomial  time, 
this  rough  approximation  of  the  complexity  of  the  entailment  algorithm  shows  that 
it  is  polynomial.  We  believe  that  a  cubic  algorithm  can  be  derived  from  the  refined 
analysis of data structures and the execution strategy. We leave this for future work.


4  Formal proof

In order to prove Theorems 1 and 2 (which we complete at the end of this section), we
need a "minimal" solution of a given solved satisfiable constraint $\gamma$. Unfortunately,
such a solution generally does not exist. In order to be able to construct such a
solution, which we will do in the second part of this section, we have to "complete" $\gamma$
by adding a ground lower bound for each variable occurring in $\gamma$.

Completing with ground lower bounds. The following two notions are used in
Definition 8 and in subsequent (inductive) proofs.

Definition 6 (chain of proper upper bounds) We say that a variable $x$ has a
proper upper bound in $\gamma$ if $\gamma$ contains a constraint of the form $x \subseteq f(x_1, \ldots, x_n)$. A
chain of proper upper bounds for $x$ in $\gamma$ is a sequence of constraints of the
form $x_i \subseteq f_i(\ldots, x_{i+1}, \ldots)$, with $x_0 = x$.

Remark 7 Every chain of proper upper bounds in a satisfiable constraint $\gamma$ is finite.

Proof. If the variable $u$ has an infinite chain of proper upper bounds in $\gamma$, then there
exists a term $\theta$ such that $\gamma \models u \subseteq \theta[u]$.

Let $\nu$ be a solution of $\gamma$ and let $s$ be a tree of minimal depth in $\nu(u)$. Then $s$ must
be of the form $\theta[s']$ where $s' \in \nu(u)$. This contradicts the minimality of $s$. □
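The maximal length of a chain of proper upper bounds can be computed by a depth-first traversal. The sketch below assumes the same hypothetical encoding as before (constraints as pairs, variables as strings, terms as tuples) and assumes that the length of a chain is the number of constraints in it; a cyclic dependency is reported as `None`, which by Remark 7 can only happen for an unsatisfiable constraint.

```python
def max_chain_length(x, gamma, _visiting=frozenset()):
    """Length of the longest chain of proper upper bounds starting at x,
    or None if a chain can be continued forever (cyclic dependency)."""
    if x in _visiting:
        return None
    best = 0
    for lhs, rhs in gamma:
        if lhs == x and isinstance(rhs, tuple):   # proper upper bound x <= f(...)
            longest_below = 0
            for arg in rhs[1:]:
                if isinstance(arg, str):          # continue the chain through arg
                    sub = max_chain_length(arg, gamma, _visiting | {x})
                    if sub is None:
                        return None
                    longest_below = max(longest_below, sub)
            best = max(best, 1 + longest_below)
    return best


acyclic = {("x", ("f", "y")), ("y", ("g", "z"))}
cyclic = {("u", ("f", "u"))}
```

A variable without any proper upper bound gets length 0, which matches the base case of the induction on $m$ used below.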

Definition 8 (lower bound completion) Let $\gamma$ be a satisfiable and closed
constraint. We say that $\gamma'$ is an m-level lower bound completion of $\gamma$ if the following
conditions hold:

1. $\gamma' = \gamma \cup \{t_1 \subseteq x_1, \ldots, t_n \subseteq x_n\}$ where all $t_i$'s are ground terms not occurring
in $\gamma$, and all $x_i$'s are variables occurring in $\gamma$;

2. $\gamma'$ is satisfiable;

3. $\gamma \models u \subseteq v$ iff $\gamma' \models u \subseteq v$ for any two variables $u, v$;

4. if $x_i$ has no proper upper bound in $\gamma$, then $t_i$ is a constant;

5. if $f(\bar{u})$ is a proper upper bound of $x_i$ in $\gamma$, then $t_i$ is of the form $t_i = f(\bar{s})$
and $\bar{s} \subseteq \bar{u} \in \gamma'$;

6. if $t_i = f(\bar{s})$ and $\bar{s} \subseteq \bar{y} \in \gamma'$, then $x_i \subseteq f(\bar{y}) \in \gamma$;

7. if the maximal chain of proper upper bounds for $x$ in $\gamma$ is of length $\leq m$, and
if $\gamma \not\models x \subseteq t$ for any ground term $t$, then $x \in \{x_1, \ldots, x_n\}$.

We say that $\gamma'$ is a lower bound completion of $\gamma$ if

• $\gamma'$ is an m-level lower bound completion of $\gamma$ for all $m$;

• for every variable $x$, either there exists a ground term $t$ such that $\gamma \models x \subseteq t$, or
there exists a term $t_x$ such that $t_x \subseteq x \in \gamma'$ and $t_x$ is unique for $x$, i.e., $t_x \neq t_y$
for $x \neq y$;

• if $x \subseteq y \notin \gamma$, then $t_x \subseteq y \notin \gamma'$.

The next two lemmas lead to Proposition 11.

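Lemma 9 Given a satisfiable constraint $\gamma$ and a variable $u$ without a proper upper
bound in $\gamma$ and a constant $\alpha_u$ not occurring in $\gamma$, the constraint $\gamma'$ defined by $\gamma' =
\gamma \cup \{\alpha_u \subseteq u\}$ is satisfiable, and $\gamma \models x \subseteq y$ iff $\gamma' \models x \subseteq y$ (for all variables $x$ and $y$).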

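Proof. We extend an arbitrary solution $\nu$ of $\gamma$ to a valuation $\nu'$ as follows. We first
introduce a new variable $u'$ and put $\nu(u') = \nu(u) \cup \{\alpha_u\}$. Then, for $v \in V(\gamma)$ we set

$\nu'(v) = \nu(v) \cup \bigcup \{\nu(\theta[u', \bar{z}]) \mid \gamma \models \theta[u, \bar{z}] \subseteq v\}$.

Since $\gamma \models u \subseteq u$, we have that $\nu'(u) \supseteq \nu(u) \cup \{\alpha_u\}$. If $\gamma \not\models \theta \subseteq v$ for any term $\theta$
containing an occurrence of $u$, then $\nu'(v) = \nu(v)$.

Clearly, $\nu'$ satisfies the constraint $\alpha_u \subseteq u$. All other conjuncts of $\gamma'$ are also in $\gamma$.
We consider their different forms, namely

• $t \subseteq x$: $t \in \nu(x)$ and $\nu(x) \subseteq \nu'(x)$ imply that $\nu'$ satisfies $t \subseteq x$;

• $x \subseteq y$: Since $\nu(x) \subseteq \nu(y)$, and $\gamma \models \theta \subseteq x$ implies $\gamma \models \theta \subseteq y$, also $\nu'(x) \subseteq \nu'(y)$;

• $f(x_1, \ldots, x_n) \subseteq y$: Let $f(t_1, \ldots, t_n)$ be an arbitrary tree in $\nu'(f(x_1, \ldots, x_n)) =
f(\nu'(x_1), \ldots, \nu'(x_n))$. We will define $n$ terms $\theta_1, \ldots, \theta_n$ such that $\gamma \models
f(\theta_1, \ldots, \theta_n) \subseteq y$ and $f(t_1, \ldots, t_n) \in \nu(f(\theta_1, \ldots, \theta_n)[u/u'])$.

If $t_i \in \nu(x_i)$ then let $\theta_i = x_i$. Otherwise, if $t_i \in \nu'(x_i) - \nu(x_i)$, then let $\theta_i$ be such
a term that $\gamma \models \theta_i \subseteq x_i$ and $t_i \in \nu(\theta_i[u/u'])$.

Thus, $\gamma \models f(\theta_1, \ldots, \theta_n) \subseteq f(x_1, \ldots, x_n)$. Since $f(x_1, \ldots, x_n) \subseteq y \in \gamma'$, we have
that $\gamma \models f(\theta_1, \ldots, \theta_n) \subseteq y$. Thus, $f(t_1, \ldots, t_n) \in \nu'(y)$.

• $x \subseteq f(y_1, \ldots, y_n)$: Let $t \in \nu'(x)$. If $t \in \nu(x)$, then $t \in \nu(f(y_1, \ldots, y_n)) \subseteq
\nu'(f(y_1, \ldots, y_n))$, and we are done.

Suppose $t \in \nu'(x) - \nu(x)$. Then there exists a term $\theta[u, \bar{z}]$ such that $\gamma \models \theta[u, \bar{z}] \subseteq x$
and $t \in \nu(\theta[u', \bar{z}])$. Since $\gamma \models \theta[u, \bar{z}] \subseteq f(y_1, \ldots, y_n)$ and $u$ has no proper upper
bounds, $\theta[u, \bar{z}]$ cannot be equal to $u$. Since $t \notin \nu(x)$, $\theta[u, \bar{z}]$ must be a
composed term of the form $f(\theta_1[u, \bar{z}], \ldots, \theta_n[u, \bar{z}])$. Now $\gamma \models f(\theta_1[u, \bar{z}], \ldots, \theta_n[u, \bar{z}]) \subseteq
f(y_1, \ldots, y_n)$, which implies $\gamma \models \theta_i[u, \bar{z}] \subseteq y_i$.

This means that $\nu(\theta_i[u', \bar{z}]) \subseteq \nu'(y_i)$ and $\nu(\theta[u', \bar{z}]) \subseteq f(\nu'(y_1), \ldots, \nu'(y_n))$.
Hence $t \in \nu'(f(y_1, \ldots, y_n))$.

• $f(\bar{u}) \subseteq f(\bar{v})$ (or $f(\bar{t}) \subseteq f(\bar{u})$): Since this conjunct is redundant in the closed
constraint $\gamma$, its satisfaction follows from the satisfaction of the constraints $x \subseteq y$
(respectively $t \subseteq x$) shown above.

It is easy to see that $\gamma \models x \subseteq y$ iff $\gamma' \models x \subseteq y$. Namely, if $\gamma \models x \subseteq y$
then $\gamma' \models x \subseteq y$. If $\gamma \not\models x \subseteq y$ then there exists $\nu$ satisfying $\nu(x) \not\subseteq \nu(y)$, and, thus,
$\nu'(x) \not\subseteq \nu'(y)$. □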

Lemma 10 For every satisfiable and closed constraint $\gamma$ and all natural numbers $k$
and $m$, there exists an m-level lower bound completion $\gamma'$ of $\gamma$ such that:

• if the chain of upper bounds for $x$ is of length $\leq m$, and if $\gamma \not\models x \subseteq t$ for any ground
term $t$, then there exist $k$ different ground terms $t_1, \ldots, t_k$ such that $t_i \subseteq x \in \gamma'$;

• if the ground term $t$ occurs in $\gamma$ and $t \subseteq y \in \gamma'$, then also $t \subseteq y \in \gamma$.

Proof. The proof goes by induction on $m$. For $m = 0$, we obtain $\gamma'$ by successively
applying $k$ times Lemma 9 to each variable in $\gamma$ without a proper upper bound.

For the induction step, let $u$ be a variable with the maximal chain of proper upper
bounds of length equal to $m + 1$ and such that $\gamma \not\models u \subseteq t$ for any ground term $t$.

Consider all proper upper bounds $u \subseteq f(\bar{v})$ for $u$. Note that by assumption $u$ does
have such bounds, and since $\gamma$ is satisfiable, all of them must have the same function
symbol $f$. Let $\bar{v}_u = (v_{u1}, \ldots, v_{un})$ be a sequence of fresh variables of the length equal
to the arity of $f$, and let $\gamma_c$ be the closure under the axioms in Table 1 of

$\gamma \cup \{\bar{v}_u \subseteq \bar{v} \mid u \subseteq f(\bar{v}) \in \gamma\}$.

It is easy to see that $\gamma_c$ is satisfiable. Namely, if $\nu$ is a solution of $\gamma$, then we get a
solution of $\gamma_c$ by putting $\nu(v_{ui}) = \bigcap_{u \subseteq f(\bar{v}) \in \gamma} \nu(v_i)$.

All variables in $\bar{v}_u$ have the chain of proper upper bounds of length at most $m$, and
there exists a variable $v_{uj}$ in the sequence $\bar{v}_u$ such that $\gamma_c \not\models v_{uj} \subseteq t$ for any ground $t$.

Thus, by the induction hypothesis, there exists an m-level lower bound
completion $\varphi$ of $\gamma_c$ with $k$ different lower bounds for $v_{uj}$.

If there exists a ground $t$ such that $\gamma \models v_{ui} \subseteq t$, set $t_i = t$; otherwise, let $t_i$ be any
lower bound for $v_{ui}$ in $\varphi$ (for $i = 1, \ldots, n$). Now we prove that $\varphi'$ is satisfiable, where

$\varphi' = \varphi \cup \{f(t_1, \ldots, t_n) \subseteq u\}$.

Let $\nu$ be any solution of $\varphi$. Similarly as we did it in Lemma 9, we introduce a new
variable $u'$ and put $\nu(u') = \nu(u) \cup \{f(t_1, \ldots, t_n)\}$. We then define

$\nu'(y) = \nu(y) \cup \bigcup \{\nu(\theta[u', \bar{z}]) \mid \gamma \models \theta[u, \bar{z}] \subseteq y\}$.

The proof that $\nu'$ satisfies the constraints in $\varphi'$ is essentially the same as in Lemma 9,
except for the case of constraints of the form $x \subseteq f(y_1, \ldots, y_n)$, where $\theta[u, \bar{z}]$ can
be equal to $u$. But then $t \in \nu(u') - \nu(u)$, that is, $t = f(t_1, \ldots, t_n)$. Since $\varphi \models
\theta[u, \bar{z}] \subseteq x$ and $\theta[u, \bar{z}] = u$ and $x \subseteq f(\bar{y}) \in \varphi$, we have $\varphi \models u \subseteq f(\bar{y})$. Hence $f(\bar{y})$
is an upper bound for $u$, and by condition 5 of Definition 8, $t_i \in \nu(y_i)$.
Therefore $t = f(t_1, \ldots, t_n) \in \nu(f(\bar{y})) \subseteq \nu'(f(\bar{y}))$. This means that the constraints of
the form $x \subseteq f(\bar{y})$ are satisfied under $\nu'$.

Now,  to  obtain  the  constraint  y  satisfying  the  induction  statement,  we  first  re¬ 
peat  the  same  construction  for  k  different  lower  bounds  tj  of  (which  exist  by  the 
induction  hypothesis)  and  for  all  variables  u  with  the  chain  of  proper  upper  bounds 
of  length  m-f  1.  We  then  “remove”  all  occurrences  of  the  variables  fit*  which  we  have 
added  for  the  construction  of  7c.  Namely,  first,  for  each  such  variable  each  term  t 
and  each  variable  y,  if  t  C  v^i  €  Y  and  v^i  Cy  £  Yi  then  we  add  to  Y  constraint 
t  Cy.  Second,  we  remove  from  Y  all  constraints  with  an  occurrence  of  this  variable. 

We  still  have  to  prove  that  Y  satisfies  the  conditions  in  Definition  8.  All  of  them 
except  3  and  6  are  either  easy  or  already  proven. 

Ad  3.  Clearly,  if  x,  €  V(7)  then  7  [=  a:  C  y  iff  93'  (=  x  C  y.  This  is  because  if 
j/(x)  2  ^{y)  then  t/(x)  g  *^(2/),  and  if  7  f=  x  C  y  then  (p^  ^  x  C  y.  It  is  also  easy  to 
see  that  ^  ^  x  C  y  ik  (p'  x  C  y. 

Ad  6.  If  f{i)  C  u  was  added  to  Y  then  i  C  y  £  ^  iff  C  y  £  This  implies 
that  each  variable  in  y  occurs  in  some  upper  bound  u  C  f{v)  and,  thus,  u  C  f(y)  £  7 
by  Axiom  3. 

To  prove  the  second  statement  of  the  lemma,  let  t  occur  in  7  and  t  C  y  €  Y- 
The  only  possibility  that  t  C  y  occurs  in  Y  *“  7  is  the  following.  There  must  exist  a 
variable  u  such  that  f  C  e  7c  for  some  i  and  C  y  e  7c.  This  is  possible  only
if  w  C  t, .. .)  occurs  implicitly  in  7  and  u  C  /(. . . ,  2/, . . .)  €  7.  But  then,  by 

Axiom  6,  t  C  y  €  7.  □ 

Proposition 11 There exists a lower bound completion for every satisfiable and
closed constraint $\gamma$.

Proof. Let $m$ be the maximal length of a chain of proper upper bounds in $\gamma$.
We modify the $m$-level lower bound completion constructed in Lemma 10 in order
to satisfy the second and the third condition in the definition of a lower bound
completion.

We can choose a term that is a unique lower bound for $x$ if we choose the
parameter $k$ in the induction statement greater than or equal to the number of
variables occurring in $\gamma$.

In  order  to  obtain  that  C  y  ^  7'  if  x  C  y  ^  7,  we  remove  all  constraints  from  y 
that  are  of  the  form  t  C  y  where  t  |  C  y  G  7}  and  t  is  not  a  subterm  of  tz  for 
some  other  variable  z.  This  is  possible  if  we  choose  the  parameter  k  in  the  induction 
statement  greater  or  equal  to  the  square  of  the  number  of  variables  occurring  in  7.  □ 

Constructing a minimal solution. The next lemma explains why the completion
of a constraint $\gamma$ with ground lower bounds applies only to variables $x$
where $\gamma \not\models x \subseteq t$ (see Condition 7 in Definition 8) if
$\gamma$ is closed (in particular under Axiom 5).

Lemma 12 Let $\gamma$ be a satisfiable and closed constraint and let $t$ be a
ground term. If $\gamma \models x \subseteq t$ then $x \subseteq t \in \gamma$.

Proof. Note that if $x$ has no proper upper bound, then
$\gamma \not\models x \subseteq t$ (this follows from the results in [25], saying
that the maximal solution for $\gamma$ assigns the set of all trees to $x$, or
from Proposition 9). Therefore, $x$ does have proper upper bounds in $\gamma$,
and since $\gamma$ is satisfiable, the head function symbol in $t$ must coincide
with the head function symbol of all proper upper bounds of $x$.

Now the proof goes by induction on the structure of $t$. If $t$ is a constant
symbol, then the only possible proper upper bound of $x$ is $x \subseteq t$, and
we are done.

If $t = f(t_1, \ldots, t_n)$ then let
$f(u_{11}, \ldots, u_{1n}), \ldots, f(u_{m1}, \ldots, u_{mn})$ be the list of all
proper upper bounds of $x$. For each argument position $i \in \{1, \ldots, n\}$
let us consider all the variables $u_{1i}, \ldots, u_{mi}$. For at least one of
them, say $u_{j_i i}$, we have $\gamma \models u_{j_i i} \subseteq t_i$ (if this
is not the case, then $\gamma \not\models x \subseteq t$). By the induction
hypothesis we have that $u_{j_i i} \subseteq t_i$ occurs implicitly in $\gamma$.
By Axiom 3, $x \subseteq f(u_{j_1 1}, \ldots, u_{j_n n}) \in \gamma$ and, thus,
also $x \subseteq t \in \gamma$. □

Definition 13 Let $\gamma'$ be a lower bound completion of the satisfiable and
closed constraint $\gamma$, let $\gamma''$ be a closure of $\gamma'$ under
transitivity. Let $\nu$ be the minimal (under pointwise inclusion) valuation
satisfying, for all finite trees $t$, the condition: $t \in \nu(x)$ iff

• $t \subseteq x \in \gamma''$ or

• $t = f(t_1, \ldots, t_n)$, $f(x_1, \ldots, x_n) \subseteq x \in \gamma''$, and
$t_i \in \nu(x_i)$ for all $i$.

We call $\nu$ the minimal solution of $\gamma''$.
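
Because membership of a tree in $\nu(x)$ depends only on the membership of its immediate subtrees, the two clauses above can be read as a recursive test. The following Python sketch illustrates this reading under simplifying assumptions that are not part of the definition: trees are nested tuples such as `("f", ("a",))`, variables are strings, and $\gamma''$ is given as a list of pairs `(lhs, x)` standing for `lhs ⊆ x`, where `lhs` is either a ground tree or a function symbol applied to variables only. The name `in_nu` and the example constraint are hypothetical.

```python
from typing import List, Tuple

Tree = tuple                   # ("f", subtree_1, ..., subtree_n); a constant is ("a",)
Lhs = tuple                    # a ground Tree, or ("f", "x1", ..., "xn") over variables
Inclusion = Tuple[Lhs, str]    # (lhs, x) stands for the constraint  lhs ⊆ x


def in_nu(t: Tree, x: str, constraints: List[Inclusion]) -> bool:
    """Decide t ∈ ν(x) by the two clauses of the definition (assumed encoding)."""
    for lhs, var in constraints:
        if var != x:
            continue
        # Clause 1: the ground term t itself is a lower bound of x.
        if lhs == t:
            return True
        # Clause 2: f(x1, ..., xn) ⊆ x with t = f(t1, ..., tn) and ti ∈ ν(xi).
        args_are_variables = all(isinstance(a, str) for a in lhs[1:])
        if (args_are_variables and lhs[0] == t[0] and len(lhs) == len(t)
                and all(in_nu(ti, xi, constraints)
                        for ti, xi in zip(t[1:], lhs[1:]))):
            return True
    return False


# Hypothetical example: a ⊆ x, f(x) ⊆ y, f(y) ⊆ y.
example = [(("a",), "x"), (("f", "x"), "y"), (("f", "y"), "y")]
a = ("a",)
f_a = ("f", a)
f_f_a = ("f", f_a)
```

In this example `in_nu(f_f_a, "y", example)` holds through the constraint `f(y) ⊆ y`, while `in_nu(a, "y", example)` does not. The recursion terminates because every recursive call is on a proper subtree of `t`.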

Our  terminology  is  justified  by  the  lemma  below. 

Lemma 14 In the situation of Definition 13, the valuation $\nu$ satisfies $\gamma''$.

Proof. First note that $\nu$ assigns nonempty sets of terms to variables. This is
because if $\gamma \models x \subseteq t$ for some ground term $t$ then, by
Lemma 12 and Axiom 5, $t \in \nu(x)$, and if $\gamma \not\models x \subseteq t$
for any ground term $t$ then, by Definition 8, $\gamma'$ contains the constraint
$t_x \subseteq x$.

The constraints in $\gamma'' - \gamma$ are of the form $t \subseteq x$ for a
ground term $t$. Thus, they are satisfied by the definition of $\nu$. We now show
that all constraints in $\gamma$ are satisfied.

• If $x \subseteq y \in \gamma$, then, for $t \in \nu(x)$, there are two
possibilities:

- $t \subseteq x \in \gamma''$. Then $t \subseteq y \in \gamma''$ since
$\gamma''$ is closed under transitivity of $\subseteq$, and, thus,
$t \in \nu(y)$;

- $t$ is of the form $t = f(t_1, \ldots, t_n)$ and
$f(x_1, \ldots, x_n) \subseteq x \in \gamma$ and $t_i \in \nu(x_i)$. Thus, again
by transitivity, $f(x_1, \ldots, x_n) \subseteq y \in \gamma''$, and
$t \in \nu(y)$.

• If $f(x_1, \ldots, x_n) \subseteq x \in \gamma$, then $t_i \in \nu(x_i)$
implies, by the definition of $\nu$, that $f(t_1, \ldots, t_n) \in \nu(x)$.

• If $x \subseteq f(x_1, \ldots, x_n) \in \gamma$, then a tree $t \in \nu(x)$
cannot be of the form $g(t_1, \ldots, t_m)$ with $f \neq g$, since then $\gamma$
would not be satisfiable. Thus, $t$ must be of the form $f(t_1, \ldots, t_n)$.
There are two possibilities:

- $t \subseteq x \in \gamma''$. If $t \subseteq x \in \gamma$ then by closedness
of $\gamma$ we have $t_i \subseteq x_i \in \gamma$. If
$t \subseteq x \in \gamma'' - \gamma$, then $t_i \subseteq x_i \in \gamma'$ by
Condition 5 of Definition 8. Thus, in both cases
$t \in f(\nu(x_1), \ldots, \nu(x_n))$.

- $f(y_1, \ldots, y_n) \subseteq x \in \gamma''$ and $t_i \in \nu(y_i)$. Then,
since $f(y_1, \ldots, y_n)$ is not a ground term,
$f(y_1, \ldots, y_n) \subseteq x \in \gamma$. Since $\gamma$ is closed,
$y_i \subseteq x_i \in \gamma$. Thus, $t_i \in \nu(x_i)$ and
$t \in f(\nu(x_1), \ldots, \nu(x_n))$.

• If $f(\bar{u}) \subseteq f(\bar{v}) \in \gamma$ (or, if
$f(\bar{t}) \subseteq f(\bar{u}) \in \gamma$), then, since this conjunct is
redundant in the closed constraint $\gamma$, its satisfaction follows from the
satisfaction of the constraints $x \subseteq y$ (respectively $t \subseteq x$)
shown above. □

Lemma 15 In the situation of Definition 13, if $t$ is a ground term occurring in
$\gamma' - \gamma$ and $t \in \nu(y)$, then either $x \subseteq y \in \gamma$ and
$t \subseteq x \in \gamma'$, or $t \subseteq y \in \gamma'$.

Proof. We first recall some basic facts. All constraints in $\gamma'' - \gamma$
have ground terms on the left-hand side of the inclusion. Thus, if
$\tau \subseteq \tau' \in \gamma''$ and $\tau$ is not ground, then
$\tau \subseteq \tau' \in \gamma$. Furthermore, the constraints in
$\gamma' - \gamma$ have ground terms on the left-hand side of the inclusion and
variables (i.e., not composed terms) on the right-hand side.

The proof of the lemma goes by induction on the definition of the minimal
solution $\nu$ of $\gamma''$. Let $t$ be a ground term occurring in
$\gamma' - \gamma$ such that $t \in \nu(y)$. Then, by the definition of $\nu$,
either $t \subseteq y \in \gamma''$ or $t = f(t_1, \ldots, t_n)$,
$t_i \in \nu(x_i)$ and $f(x_1, \ldots, x_n) \subseteq y \in \gamma''$.

Induction base: $t \subseteq y \in \gamma''$. Then, since $\gamma''$ is a
transitive closure of $\gamma'$, either $t \subseteq y \in \gamma'$ (and we are
done), or there exists a variable $x$ such that $t \subseteq x \in \gamma'$ and
$x \subseteq y \in \gamma''$. Since $x$ is a variable, $x \subseteq y \in \gamma$.

Induction step: $t = f(t_1, \ldots, t_n)$, where $t_i \in \nu(x_i)$ and
$f(x_1, \ldots, x_n) \subseteq y \in \gamma''$. Since $f(x_1, \ldots, x_n)$ is
not ground, $f(x_1, \ldots, x_n) \subseteq y \in \gamma$. By the induction
hypothesis, for all $i$, we have either $t_i \subseteq x_i \in \gamma'$ or there
exists a variable $u_i$ such that $t_i \subseteq u_i \in \gamma'$ and
$u_i \subseteq x_i \in \gamma$.

Let $x$ be a variable such that $t \subseteq x \in \gamma'$. By Condition 6 of
Definition 8, if $t_i \subseteq y_i \in \gamma'$ then $f(y_1, \ldots, y_n)$ is an
upper bound for $x$ in $\gamma$. Let $y_i = x_i$ if
$t_i \subseteq x_i \in \gamma'$, and $y_i = u_i$ if
$t_i \subseteq u_i \in \gamma'$. Then, for all $i$, we have
$y_i \subseteq x_i \in \gamma$. Thus, by Axiom 2,
$f(y_1, \ldots, y_n) \subseteq f(x_1, \ldots, x_n) \in \gamma$, and by
transitivity $x \subseteq y \in \gamma$. □

Corollary 16 In the situation of Definition 13, we have: $t_x \in \nu(y)$ iff
$x \subseteq y \in \gamma$.

Lemma 17 In the situation of Definition 13, if $t$ is a ground term that occurs
in $\gamma$ and $t \in \nu(x)$, then $t \subseteq x \in \gamma$.

Proof. The proof goes by induction on the structure of $t$. If $t$ is a constant
symbol then $t \subseteq x \in \gamma''$ from the definition of $\nu$, and the
thesis of the lemma follows from Lemma 10.

Now suppose $t = f(t_1, \ldots, t_n)$. If $t \subseteq x \in \gamma''$ then,
again by Lemma 10, $t \subseteq x \in \gamma$ and we are done. Otherwise,
$t_i \in \nu(x_i)$ and $f(x_1, \ldots, x_n) \subseteq x \in \gamma$. Then, by the
induction hypothesis, $t_i \subseteq x_i \in \gamma$. Since $f(t_1, \ldots, t_n)$
occurs in $\gamma$ and $\gamma$ is closed (in particular under Axiom 2),
$f(t_1, \ldots, t_n) \subseteq f(x_1, \ldots, x_n) \in \gamma$. Thus, by
transitivity, $f(t_1, \ldots, t_n) \subseteq x \in \gamma$. □

Proof of Theorem 2 (Entailment). The “if” direction of the proof is obvious. For
the “only if” direction, let $\gamma'$, $\gamma''$ and $\nu$ be as in
Definition 13. We assume $x \subseteq y \notin \gamma$.

• If there exists a ground term $t$ such that $\gamma \models x \subseteq t$,
then $x \subseteq t \in \gamma$ by Lemma 12. Since
$x \subseteq y \notin \gamma$, by Axiom 4 we have $t \subseteq y \notin \gamma$.
This implies $t \subseteq y \notin \gamma''$. Now Lemma 17 yields that
$t \notin \nu(y)$. Thus, $\nu(x) \not\subseteq \nu(y)$. Since $\nu$ is a solution
of $\gamma$, $\gamma \not\models x \subseteq y$.

• Otherwise (i.e., there exists no ground term $t$ such that
$\gamma \models x \subseteq t$), we have $t_x \subseteq x \in \gamma''$.
Corollary 16, together with our assumption $x \subseteq y \notin \gamma$, yields
that $t_x \in \nu(x) - \nu(y)$. This means that
$\gamma'' \not\models x \subseteq y$. Since each solution of $\gamma''$ is also a
solution of $\gamma$, we have $\gamma \not\models x \subseteq y$. □

Proof of Theorem 1 (Independence). The “only if” direction is obvious. For the
proof of the “if” direction, let $\gamma'$, $\gamma''$ and $\nu$ be as in
Definition 13. By Remark 5 and Theorem 2 we have that $\gamma''$ does not contain
$x_i \subseteq y_i$ for any $i$. Thus, $\nu(x_i) \not\subseteq \nu(y_i)$ for all
$i$. That is, $\nu$ is a solution of
$\gamma \cup \bigcup_i \{x_i \not\subseteq y_i\}$. □


5  Conclusion 

We  have  shown,  for  the  first  time,  the  independence  property  for  a  natural  (and 
practically  used)  class  of  set  constraints.  We  have  also  given  a  polynomial  entailment 
test.  Together,  this  yields  a  polynomial  satisfiability  test  for  conjuncts  with  negation. 

The  main  interest  of  our  results  is  a  fundamental  one.  As  for  applications,  we  still 
have  to  investigate  how  exactly  our  results  can  help  to  make  more  precise  the  analysis 
of  concurrent  constraint  programs,  which  is  from  where  this  class  of  set  constraints 
originates. 

As pointed out in [25], a potential application of the constraint system INES
lies in its use as an instance of the CLP(X) scheme [16]. Our results are
relevant for the semantics of CLP(INES) programs (see [22]) and imply that one
can manipulate negated constraints in the same way as inequations in Prolog-II
[9]. Having presented an incremental entailment test, we may also conceive of
using INES in a CCP language, e.g., in Oz.

We  believe  that  there  are  many  other  interesting  classes  of  set  constraints  with  the 
independence property. Let us consider, for example, the extension of INES constraints
where  terms  may  be  formed  with  the  union  operator  (see  Footnote  3).  We  conjecture 
that  if  we  exclude  the  cases  where  a  variable  x  is  contained  in  a  finite  union  of  terms 
with  ground  subterms,  then  the  independence  property  holds. 

Abstract. The paper describes a simple modeling and programming
approach for speeding up constraint propagation. The idea, although
similar to redundant constraints, is based on the concept of redundant
modeling. We define the notions of CSP model and model redundancy formally, and
show  how  mutually  redundant  models  can  be  combined  and  connected 
using  channeling  constraints.  The  combined  model  contains  the  origi¬ 
nal  but  redundant  models  as  sub-models.  Channeling  constraints  allow 
the  sub-models  to  cooperate  during  constraint-solving  by  propagating 
constraints  freely  amongst  the  sub-models.  This  extra  level  of  pruning 
and  propagation  activities  becomes  the  source  of  execution  speedup.  We 
apply  our  method  to  the  design  and  construction  of  a  real-life  nurse  ros¬ 
tering  system.  Experimental  results  provide  empirical  evidence  in  line 
with  our  prediction. 


Keywords:  Constraint  Propagation,  Redundant  Modeling,  Nurse  Rostering 


1  Introduction 

The  problem  at  hand  is  that  of  constraint  satisfaction  problems  (CSP)  defined 
in  the  sense  of  Mackworth  [16],  which  can  be  stated  briefly  as  follows: 

We  are  given  a  set  of  variables,  a  domain  of  possible  values  for  each 
variable,  and  a  conjunction  of  constraints.  Each  constraint  is  a  relation 
defined  over  a  subset  of  the  variables,  limiting  the  combination  of  values 
that  the  variables  in  this  subset  can  take.  The  goal  is  to  find  a  consis¬ 
tent  assignment  of  values  to  the  variables  so  that  all  the  constraints  are 
satisfied  simultaneously. 
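
To make the problem statement concrete, the following Python sketch encodes variables, domains and constraints and searches for a consistent assignment by plain chronological backtracking. It is only an illustration under our own assumptions: the names `Constraint`, `consistent` and `solve` and the toy instance are hypothetical, and no constraint propagation is performed.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A constraint is a (scope, predicate) pair: the predicate receives the values
# of the scope variables, in order, and returns True when they are compatible.
Constraint = Tuple[Sequence[str], Callable[..., bool]]


def consistent(assignment: Dict[str, int], constraints: List[Constraint]) -> bool:
    """Check every constraint whose scope is already fully assigned."""
    for scope, predicate in constraints:
        if all(v in assignment for v in scope):
            if not predicate(*(assignment[v] for v in scope)):
                return False
    return True


def solve(domains: Dict[str, List[int]], constraints: List[Constraint],
          assignment: Optional[Dict[str, int]] = None) -> Optional[Dict[str, int]]:
    """Chronological backtracking; returns one solution or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return dict(assignment)
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment, constraints):
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]          # undo the choice before trying the next value
    return None


# Toy instance: x < y, and z differs from both x and y.
toy_domains = {"x": [1, 2, 3], "y": [1, 2, 3], "z": [1, 2, 3]}
toy_constraints: List[Constraint] = [
    (("x", "y"), lambda a, b: a < b),
    (("x", "z"), lambda a, b: a != b),
    (("y", "z"), lambda a, b: a != b),
]
toy_solution = solve(toy_domains, toy_constraints)
```

On the toy instance the search returns the assignment x = 1, y = 2, z = 3.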

CSPs are, in general, NP-complete and some are even NP-hard [5]. Thus, a
general algorithm designed to solve any CSP will necessarily require exponential
time in problem size in the worst case. One widely adopted approach to solving
CSPs features various combinations of backtracking tree search and
constraint propagation [19, 12]. This framework is realized in the constraint logic
programming language CHIP [8] and the C++ class library ILOG SOLVER [11],
as  many  mutually  redundant  models  as  one  can  dream  up.  One  must,  however,
take  into  account  the  time  and  memory  overhead  of  larger  network  size  in  the 
combined  model.  In  addition,  implementing  an  alternate  model  is  not  a  task  that 
can  be  taken  lightly. 

Another  common  and  important  question  to  ask  is  how  to  generate  a  redun¬ 
dant  model.  We  observe  that  most  real-life  scheduling  and  resource  allocation 
problems  can  be  modeled  reciprocally  in  the  sense  that  if  a  problem  can  be  mod¬ 
eled  as  assigning  objects  of  type  X  to  objects  of  type  Y ,  then  the  same  problem 
can  also  be  modeled  as  assigning  objects  of  type  Y  to  those  of  type  X.  This 
principle  is  general  and  widely  applicable.  At  the  beginning  of  this  section,  we 
have  already  given  an  example  of  applying  the  principle  to  a  generic  job-shop 
scheduling  problem.  Similarly,  we  can  model  a  train  driver  rostering  problem  as 
either  assigning  trips  to  drivers  or  assigning  drivers  to  trips.  We  can  also  model 
a  school  timetabling  problem  as  either  assigning  courses  to  time  slots  in  the 
timetable or assigning a set of time slots to courses. We can find more examples
of this type. It may be argued that the above resembles the resource-centered
view and the job-centered view used in the scheduling community. The novelty
of our proposal does not lie in suggesting multiple perspectives on the same
problem. It lies in connecting these perspectives to form a combined
perspective which exhibits more efficient pruning behaviour.

Using  reciprocal  views  of  problems  is  just  one  way  of  obtaining  redundant 
models  but  it  is  not  the  only  way.  The  general  guideline  is  to  study  the  problem 
at  hand  from  different  angles  and  perspectives.  For  example,  Tsang  [21]  suggests 
two  ways  of  modeling  the  8-queens  problem.  The  first  and  familiar  model  consists 
of  eight  variables,  each  of  which  denotes  the  position  of  a  queen  in  a  different 
row  of  the  chess  board.  Every  domain  contains  the  eight  possible  positions  of 
a queen. The alternative model also consists of eight variables, each of which
denotes the position of a queen on any one of the sixty-four squares of the
chess board. Thus, the domain of each variable becomes {1, ..., 64}. It is not
difficult  to  come  up  with  yet  another  model  consisting  of  sixty-four  variables, 
each  of  which  has  domain  {0,1}.  Each  variable  denotes  a  square  on  the  board.  A 
value  of  0  denotes  an  empty  square  and  a  value  of  1  denotes  a  square  occupied 
by  a  queen. 
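
The three 8-queens models describe the same placements, so a solution of one can be translated into the others. The Python sketch below illustrates this; the function names and the numbering of squares (row by row, starting at 1) are our own assumptions and are not taken from [21].

```python
from typing import List

N = 8  # board size


def is_solution(cols: List[int]) -> bool:
    """Row model: cols[r] is the column (1..N) of the queen in row r."""
    return all(
        cols[i] != cols[j] and abs(cols[i] - cols[j]) != j - i
        for i in range(N) for j in range(i + 1, N)
    )


def rows_to_squares(cols: List[int]) -> List[int]:
    """Square model: one value in 1..N*N per queen (squares numbered row by row)."""
    return [r * N + c for r, c in enumerate(cols)]


def squares_to_board(squares: List[int]) -> List[int]:
    """Boolean model: N*N variables with domain {0, 1}; 1 marks an occupied square."""
    board = [0] * (N * N)
    for s in squares:
        board[s - 1] = 1
    return board


def board_to_rows(board: List[int]) -> List[int]:
    """Back to the row model; assumes exactly one queen in every row."""
    return [board[r * N:(r + 1) * N].index(1) + 1 for r in range(N)]


queens = [1, 5, 8, 6, 3, 7, 2, 4]          # a known 8-queens solution
squares = rows_to_squares(queens)
board = squares_to_board(squares)
```

The round trip `board_to_rows(board)` returns the original row assignment, which is the kind of correspondence that lets such models be treated as mutually redundant.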

Channeling constraints for models $M_1$ and $M_2$ must be able to propagate
constraints from $M_1$ to $M_2$ and vice versa. Suppose $M_1$ is modeled after
assigning objects of type X to those of type Y and $M_2$ is modeled after
assigning objects of type Y to those of type X. An effective channeling
constraint is of the form:

The  variable  associated  with  object  x  of  type  X  has  object  y  of  type  Y  as 
value  if  and  only  if  the  variable  associated  with  y  has  x  as  value. 

This  form  of  constraint  is  simple  and  can  be  generated  systematically. 
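
As a rough illustration of how such a constraint lets pruning in one model reach the other, the following Python sketch keeps two families of domains consistent with the "if and only if" condition. It assumes the simple case in which every variable of either model takes a single object as value; the function name `channel` and the driver and trip data are hypothetical.

```python
from typing import Dict, Set


def channel(dom1: Dict[str, Set[str]], dom2: Dict[str, Set[str]]) -> bool:
    """Prune both models until y in dom1[x] holds exactly when x in dom2[y].

    dom1 maps each object x of type X to its candidate values of type Y;
    dom2 maps each object y of type Y to its candidate values of type X.
    Returns False if some domain becomes empty.
    """
    changed = True
    while changed:
        changed = False
        for x, ys in dom1.items():
            for y in list(ys):
                if x not in dom2[y]:      # y can no longer take x, so x cannot take y
                    ys.discard(y)
                    changed = True
        for y, xs in dom2.items():
            for x in list(xs):
                if y not in dom1[x]:      # x can no longer take y, so y cannot take x
                    xs.discard(x)
                    changed = True
    return all(dom1.values()) and all(dom2.values())


# Hypothetical data: model two has already ruled out driver d2 for trip t1.
drivers = {"d1": {"t1", "t2"}, "d2": {"t1", "t2"}}
trips = {"t1": {"d1"}, "t2": {"d1", "d2"}}
still_feasible = channel(drivers, trips)
```

After the call, the domain of `d2` in the first model has shrunk to `{"t2"}`, although the pruning originated in the second model.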

Note that redundant constraints and redundant models are orthogonal concepts
although their working principles are similar. Suppose we are given two models
$M_1 = \langle X_1, D_{X_1}, C_{X_1} \rangle$ and
$M_2 = \langle X_2, D_{X_2}, C_{X_2} \rangle$ of a problem $P$. The constraints
$C_{X_1}$ and $C_{X_2}$ have no relationship to each other at all since they do
not even share variables. Within a model, say $M_1$ (or $M_2$), we can add a
redundant constraint $c$ such that $c$ is entailed by $C_{X_1}$ (or $C_{X_2}$).

3  A  Case  Study 

Our  task  is  to  design  and  implement  a  nursing  staff  rostering  system  for  the 
Ambulance  and  Emergency  Unit  (AEU)  of  the  Tang  Shiu  Kin  Hospital  (TSKH) 
in  Hong  Kong.  The  AEU  provides  daily  24-hour  emergency  services  to  the  general 
public  seven  days  a  week.  Therefore,  AEU  nurses  have  to  work  in  shifts.  The  main 
function  of  the  rostering  system  is  to  roster  the  nurses  in  such  a  way  that  steady 
and  high-quality  services  are  provided  to  the  community,  taking  into  account 
(1)  professional  rules,  (2)  preferential  rules,  (3)  pre-arranged  duties,  and  (4) 
pre-arranged  preferred  shifts.  Other  than  having  to  obey  all  professional  rules, 
fairness  is  an  important  measure  of  the  quality  of  the  generated  roster.  Every 
nurse must be assured of an equal chance of taking night shifts, having days off
on weekends or on actual public holidays, etc., although most fairness rules are
in the form of soft constraints. The main difficulties of the process arise from
pre-arranged duties, vacation leaves, and pre-arranged preferred shifts requested
by the nurses, which often break the regularities of wanted (or unwanted) duties.
In the following, we give an overview of the nurse rostering system. Readers
interested in how we handle fair rotation of wanted and unwanted shifts and soft
constraints are referred to [4].

There  are  three  basic  shifts  in  a  day:  namely  AM  shift  (A),  PM  shift  (P), 
and  night  shift  (N).  An  evening  shift  (E)  is  essentially  a  PM  shift  with  a  slightly 
different  duty  time.  An  irregular  shift  (I)  has  special  working  hours  either  ar¬ 
ranged  by  the  nursing  officer  or  requested  by  individual  nurses.  Other  shift  types 
concern  holidays  and  special  duties.  Nurses  can  take  several  different  types  of 
holiday.  These  include  day-off  (O),  compensation-off  (CO),  public  holiday  (PH) 
and  vacation  leave  (VL).  In  addition,  nurses  can  be  pre-assigned  some  special 
work  shifts  such  as  study  day  (SD)  and  staff-on-loan  (SOL).  In  total,  there  are 
eleven  shift  types. 

A weekly duty roster for week i should be generated about two weeks prior
to i. The rostering process assigns, for each nurse, a work shift for Monday
to Sunday, taking into account the nurse's past rostering history. A sample
roster duty sheet is shown in Figure 1. There are two steps for a duty planner to
complete a duty roster. First, the planner has to collect information about
pre-assigned shifts and shift requests from the nurses. Second, the planner fills
in the remaining shift slots, observing all planning rules and nurse preference
rules.

There  are  two  types  of  planning  rules:  imperative  planning  rules  and  prefer¬ 
ence  planning  rules.  Imperative  planning  rules  are  rules  that  must  be  respected 
in any timetable. Therefore, in generating the roster, the duty planner must
ensure that every planning decision made is coherent with these hard rules. There
are six imperative rules in total. Some example rules are:

Rule 1 Each staff member is required to work one shift per day.

Fig.  1.  A  Sample  Roster  Duty  Sheet


Rule 2 Each staff member gets one day off per week.

Preference planning rules are optional in the sense that they should be
satisfied as much as possible. Violating preference rules, however, does not
destroy the validity of a roster. Therefore, preference rules correspond to soft
constraints. A soft constraint $c$ can be viewed as a disjunctive constraint
$c \vee \mathit{true}$. Following Van Hentenryck [22], we model disjunctive
constraints as choices. While choice is easy to implement in a constraint
programming language, accumulation of soft constraints can result in a huge
search tree. While the preference rules are not professional rules, they do
reflect the preference of nurses in general. Therefore, the number of preference
rules satisfied is a good measure of the quality of a generated roster. There
are ten preference rules in total^. Some example rules are:

Rule 3 A nurse should not work the same shift for two consecutive days. In
particular, a nurse prefers alternating A and P/E shifts. If such an arrangement
is impossible, a nurse also accepts two consecutive A or P shifts. However,
under no circumstances will three consecutive A or P/E shifts be accepted by a
nurse.

Rule 4 Apart from all holiday schedules, the number of A shifts and P shifts
allocated to each nurse each week should be balanced.


4  Modeling 

In this section, we describe two models for the nurse rostering problem. The first
model is based on the format of the roster sheet, which suggests allocation of
shifts to nurses. Implementation of the first model performs well in general but
fails to return an answer in a timely manner for some difficult cases. This
prompts our work on the second model, which is an allocation of nurses to shifts.
Implementation of the second model again exhibits deficiency on some problem
instances while performing well in general. Having two mutually redundant models
in hand, we proceed to connect the models using channeling constraints.

^ Two of the ten preference rules cannot be expressed as constraints. They are
implemented algorithmically. The remaining eight rules are numbered C2.1, C2.2,
C2.3a, C2.5, C2.6, C2.7, C2.8, and C2.9 respectively. Interested readers are
referred to [4] for details.


4.1  Model  One 

In  a  roster  sheet,  each  row  consists  of  seven  slots,  holding  the  work  shifts  assigned 
to  a  nurse  in  a  scheduled  week.  Each  nurse  occupies  a  row  in  the  roster  sheet.  It 
is thus natural to model the slots Nurse-DayOfWeek on the sheet as constrained
variables, each of which is associated with a domain of eleven possible shift types
{A, P, N, E, I, O, CO, PH, VL, SD, SOL}. If there are n nurses in the AEU, then there
will  be  7n  variables  in  the  rostering  system. 

Under  this  formulation,  we  have  rule  1  satisfied  for  free  since  the  rule  is 
implicitly  satisfied  by  requiring  each  constrained  variable  to  take  on  exactly  one 
value.  Rule  2  can  be  modeled  using  a  counting  constraint  [11]  as  follows: 

For  each  nurse  Nurse,  the  number  of  variables  in  the  set 

{Nurse-Mon,  Nurse-Tue, . . . ,  Nurse-Sun} 

assigned the day-off (O) shift must be equal to one.
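
The counting constraint for rule 2 can be illustrated, outside any constraint solver, as a check on a complete roster in the format of model one. This is a minimal sketch under our own assumptions: the names `count_shift` and `satisfies_rule_2` and the sample rosters are hypothetical, and the code does not use the counting constraint of [11].

```python
from typing import Dict, List

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def count_shift(week: List[str], shift: str) -> int:
    """Number of the seven Nurse-DayOfWeek slots assigned the given shift."""
    return sum(1 for s in week if s == shift)


def satisfies_rule_2(roster: Dict[str, List[str]]) -> bool:
    """Rule 2: every nurse has exactly one day-off (O) shift in the week."""
    return all(len(week) == len(DAYS) and count_shift(week, "O") == 1
               for week in roster.values())


# Hypothetical rosters, one row of seven shifts (Mon..Sun) per nurse.
good_roster = {
    "ann": ["A", "P", "A", "P", "N", "O", "A"],
    "bob": ["P", "A", "O", "A", "P", "A", "P"],
}
bad_roster = {
    "ann": ["A", "P", "A", "P", "N", "O", "O"],   # two days off
}
```

Rule 1 needs no check of this kind in model one, since each slot holds exactly one shift by construction.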

Preference  planning  rules,  or  soft  constraints,  are  expressed  in  the  same  man¬ 
ner  as  imperative  planning  rules.  There  is,  however,  one  significant  difference  in 
how  they  are  posted  to  the  constraint-solving  engine.  For  each  soft  constraint 
c,  we  set  up  a  choice  point.  In  the  first  branch,  the  constraint  c  is  told  to  the 
solver.  The  other  branch,  one  without  c,  is  tried  upon  backtracking. 
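
The choice-point treatment of soft constraints can be sketched as follows. The code is an illustration only: a brute-force enumerator (`brute_force`, a hypothetical stand-in for the constraint solver) replaces propagation-based search, and all names and the toy instance are assumptions of ours.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def brute_force(domains: Dict[str, List[int]],
                constraints: List[Constraint]) -> Optional[Assignment]:
    """Stand-in solver: first assignment satisfying all posted constraints."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        candidate = dict(zip(names, values))
        if all(c(candidate) for c in constraints):
            return candidate
    return None


def solve_with_choices(domains: Dict[str, List[int]], hard: List[Constraint],
                       soft: List[Constraint]) -> Optional[Assignment]:
    """One choice point per soft constraint c: post c first, drop it on failure."""
    if not soft:
        return brute_force(domains, hard)
    first, rest = soft[0], soft[1:]
    # First branch: the soft constraint is told to the solver.
    result = solve_with_choices(domains, hard + [first], rest)
    if result is not None:
        return result
    # Second branch, reached only on backtracking: continue without it.
    return solve_with_choices(domains, hard, rest)


# Toy instance: the first soft constraint contradicts the hard one.
toy_domains = {"x": [0, 1], "y": [0, 1]}
toy_hard = [lambda a: a["x"] != a["y"]]
toy_soft = [lambda a: a["x"] == a["y"], lambda a: a["x"] == 1]
toy_result = solve_with_choices(toy_domains, toy_hard, toy_soft)
```

Here the first soft constraint is eventually dropped and the second is kept, giving x = 1, y = 0. With k soft constraints the recursion may explore up to 2^k branches, which is the search-tree growth noted in Section 3.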


4.2  Model  Two 

Model  two  regards  the  rostering  process  as  assigning  nurses  to  serve  in  the  eleven 
shifts  in  each  day  of  a  week.  Since  there  may  be  more  than  one  nurse  working 
in one shift of a day, the variables take on sets of nurses as values. Variables
of this kind are called constrained set variables [11, 9]. Thus, we model each
shift of a day Shift-DayOfWeek as a constrained set variable whose domain is
the power set of the set of all nurses. Regardless of the number of nurses in the
AEU, the rostering system contains 77 (11 x 7) variables. If there are n nurses
in the AEU, the size of the domain of each variable is $2^n$.

The  expression  of  the  planning  rules  is  now  based  on  set  operations  and 
constraints.  For  example,  rule  1  is  now  expressed  as  the  constraints: 

For  each  day  Day,  consider  the  set  of  variables 

V  =  {A-Day,  P-Day,  N-Day,  E-Day, . . . ,  VL-Day,  SD-Day,  SOL-Day}. 

The following two constraints must be satisfied: (1) Elements of V are
pairwise disjoint and (2) the cardinality of $\bigcup V$ is equal to the number
of nurses.

Similarly,  rule  2  is  expressed  as: 

Consider the set

V = {O-Mon, O-Tue, . . . , O-Sun}.

The following two constraints must be satisfied: (1) Elements of V are
pairwise disjoint and (2) the cardinality of $\bigcup V$ is equal to the number
of nurses.
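
Both rules of model two reduce to the same pair of set conditions, which can be illustrated by a check on fully assigned set variables. The Python sketch below is our own illustration; the function names and the sample data are hypothetical, and the sets are assumed to contain only nurses of the unit.

```python
from itertools import combinations
from typing import List, Set


def pairwise_disjoint(sets: List[Set[str]]) -> bool:
    """Condition (1): no nurse appears in two elements of V."""
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))


def disjoint_and_covering(sets: List[Set[str]], nurses: Set[str]) -> bool:
    """Conditions (1) and (2): disjoint, and the union has as many members as there are nurses."""
    union = set().union(*sets)
    return pairwise_disjoint(sets) and len(union) == len(nurses)


nurses = {"ann", "bob", "eve"}
# Hypothetical values of O-Mon, ..., O-Sun: every nurse has exactly one day off.
days_off = [{"ann"}, set(), {"bob"}, set(), set(), set(), {"eve"}]
# bob would have two days off and eve none.
bad_days_off = [{"ann"}, {"bob"}, {"bob"}, set(), set(), set(), set()]
```

Together the two conditions say that the elements of V partition the set of nurses: every nurse occurs in exactly one of them.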

Again, the soft constraints in this model are represented in the same way as
imperative constraints, and are posted to the solving engine using choice points.


4.3  Combining  the  Models 

The combined model contains model one and model two as sub-models. In addition,
a set of channeling constraints is used to relate variables in the two models
so that constraints can be propagated between the two models. The channeling
constraints are of the following form:

Nurse-DayOfWeek == Shift if and only if Nurse $\in$ Shift-DayOfWeek.

It is worth noting that the constraints are generated mechanically using a simple
double for-loop in our implementation. In general, there are 7mn channeling
constraints for m nurses and n shift types for a weekly roster.
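
The mechanical generation and the count 7mn can be illustrated in Python. This is a sketch under our own assumptions, not the C++ implementation: each channeling constraint is represented by a (nurse, day, shift) triple, and `channels_hold` (a hypothetical helper) checks the "if and only if" condition on fully assigned models.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
SHIFTS = ["A", "P", "N", "E", "I", "O", "CO", "PH", "VL", "SD", "SOL"]

Slot = Tuple[str, str]        # (nurse, day): a variable of model one
ShiftDay = Tuple[str, str]    # (shift, day): a variable of model two
Channel = Tuple[str, str, str]


def channeling_constraints(nurses: List[str]) -> List[Channel]:
    """One triple per constraint 'Nurse-Day == Shift iff Nurse in Shift-Day'."""
    constraints = []
    for nurse, day in product(nurses, DAYS):    # every variable of model one
        for shift in SHIFTS:                    # every value it may take
            constraints.append((nurse, day, shift))
    return constraints


def channels_hold(model_one: Dict[Slot, str],
                  model_two: Dict[ShiftDay, Set[str]],
                  constraints: List[Channel]) -> bool:
    """Check every channeling constraint on two fully assigned models."""
    return all(
        (model_one[(nurse, day)] == shift) == (nurse in model_two[(shift, day)])
        for nurse, day, shift in constraints
    )


benchmark = channeling_constraints(["nurse%d" % i for i in range(27)])
```

For 27 nurses and 11 shift types the list has 7 x 27 x 11 = 2079 entries.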

5  Implementation  and  Preliminary  Results 

To verify the correctness and effectiveness of our modeling, we have implemented
the three models on a personal computer (PC), which is chosen over a workstation
for its wide availability. Our prototypes consist of a user interface and a
rostering engine. The former is implemented in Microsoft Visual Basic 3.0 while
the latter is realized in C++ with the ILOG Solver library 3.0 [11]. Class
constraints [11] of ILOG Solver facilitate our object-oriented design and greatly
reduce our coding effort.

Implementations of model one, model two and the combined model consist
of 6015, 4523, and 8656 lines of C++ code respectively. This small difference in
code size is a result of the fact that constraint expressions occupy relatively
few lines as compared to coding for I/O, data structure definition, etc.
Therefore, much coding can be shared and combined in implementing the combined
model, resulting in no significant increase in code size. Most important of all,
we implement model two and the combined model in four man-weeks, whereas model
one is implemented in four man-months. The first implementation takes longer
since we spend much time on problem understanding, correspondence with users,
experimentation with ideas and modification. The time used for the second and
third implementations is purely devoted to coding.

Our real-life benchmark data contains 27 nurses and 11 shift types. The
rostering system produces a weekly roster sheet^. The implementation of model
one  generates  3884  constraints,  out  of  which  266  are  imperative,  during  execu¬ 
tion.  Executing  the  model  two  implementation  results  in  1827  constraints,  out 
of  which  1675  are  imperative.  These  unbalanced  figures  reflect  two  phenomena. 
First,  some  types  of  constraints  are  more  expressive  and  concise  than  others.  Sec¬ 
ond,  some  planning  rules  that  are  easier  to  express  in  one  model  can  be  tedious  to 
formulate  in  another  model.  Finally,  the  combined  model  consists  of,  in  addition 
to  the  constraints  from  models  one  and  two,  2079  channeling  constraints. 

We employ the simple smallest-domain-first variable-ordering heuristic and
the smallest-first value-ordering heuristic in the variable labeling process. In
most cases, the prototypes can return a solution within 5 seconds running on a
Pentium PC. In the following, we present the timing results of the three models
for a particularly difficult problem instance. Figure 2 shows the preset requests
of this particular week's roster, which should partially explain why this problem
instance is difficult. In this week's roster assignment, 17 out of the 27 nurses
request preset shifts. Among them, 5 nurses request the day-off (O) shift on
Sunday, which is most wanted by other nurses as well. Two nurses request vacation
leave (VL), which extends across the entire week. Last but not least, one nurse
is assigned special duty (SD) from Monday to Friday and requests O on
Sunday. As a result, 47 out of 189 slots are preset before the roster can be
generated.  Some  of  the  preset  slots  are  filled  in  with  the  most  undesirable  (from 
the  rostering  point  of  view)  patterns,  making  this  week’s  roster  exceptionally 
hard  to  generate. 
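
The two labeling heuristics named at the start of the paragraph above can be illustrated in a few lines of Python. This is a generic sketch with hypothetical function names and toy domains; it is not the labeling code of the prototypes.

```python
from typing import Dict, List, Set


def select_variable(domains: Dict[str, Set[int]], assigned: Set[str]) -> str:
    """Smallest-domain-first: pick an unassigned variable with the fewest values left."""
    unassigned = [v for v in domains if v not in assigned]
    return min(unassigned, key=lambda v: len(domains[v]))


def order_values(domain: Set[int]) -> List[int]:
    """Smallest-first: try the values of the chosen variable in increasing order."""
    return sorted(domain)


toy_domains = {"a": {1, 2, 3}, "b": {2, 3}, "c": {3}}
first_choice = select_variable(toy_domains, assigned=set())
second_choice = select_variable(toy_domains, assigned={"c"})
```

On the toy domains the variable `c` is labeled first, then `b`. Preset slots have singleton domains, so under this ordering they are, in effect, fixed before any free slot is labeled.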

The number and the nature of the preference rules posted also affect the
difficulty of the problem. For the same set of preset requests, we vary the
combination of preference rules posted and show the performance of the three
models in Figure 3. Column one of each row specifies the combination of
preference rules imposed. Columns two and three contain the timings of models
one and two respectively. The string “short” means that execution finishes
within 1 minute. The string “fast” means an almost instantaneous response and
“long” means that the system does not return an answer within 30 minutes. Since
the combined model contains models one and two as sub-models, labeling variables
of either model will also instantiate variables in the other model. Column four
shows the timing result of the combined model by labeling only variables in
model one. Alternatively, we can label only variables in model two and yield the
results in column five. Experimental results confirm that the combined model
exhibits significant speedup over either model one or two.


^  Generating  more  than  one  week’s  roster  can  be  achieved  by  running  the  system 
multiple times.

Fig. 2. A Difficult Rostering Problem


6  Concluding  Remarks 

The  contribution  of  this  paper  is  three-fold.  First,  we  define  formally  the  notions 
of  modeling  and  model  redundancy  of  constraint  satisfaction  problems.  These 
definitions  serve  as  the  foundation  for  the  systematic  study  of  the  use  of  redun¬ 
dant  modeling  to  speed  up  constraint-solving.  Second,  we  introduce  channeling 
constraints  for  combining  mutually  redundant  models  of  the  same  problem.  We 
further suggest guidelines for constructing alternate models for a problem and
for how the models can be connected by a form of channeling constraints. In
the nurse rostering application, the channeling constraints are generated
mechanically, if not automatically, using a simple double for-loop. Third, we
apply our method to a real-life nurse rostering problem and verify empirically
that the combined model does exhibit significant speedup over either of the
mutually redundant models.

We  believe  that  the  redundant  modeling  method  will  be  a  valuable  tool  for 
the  constraint  community.  The  method,  although  simple,  is  systematic  enough, 
allowing  programmers  to  generate  channeling  constraints  almost  mechanically. 
Our  approach  also  requires  no  special  insight  into  the  problem  domain  or  the 
problem  itself.  This  is  not  true  for  the  case  of  introducing  redundant  constraints. 
Redundant models and channeling constraints are different from redundant
constraints, although they are based on a similar working principle. Programmers
can  still  exploit  full  knowledge  of  the  problem  or  the  problem  domain  to  inject 
redundant  constraints  into  one  or  all  of  the  redundant  models  to  further  speed 
up  constraint-solving.  Another  nice  property  of  our  method  is  that  it  implies  no 
modification  to  the  underlying  constraint  solver  or  labeling  heuristics. 

A  potential  problem  of  our  method  is  the  memory  overhead  introduced  by  the 
accommodation  of  more  than  one  model  (variables  and  constraints)  of  a  problem. 
There  is  always  a  tradeoff  between  time  and  space  in  the  design  of  algorithms. 


The  choice  is  always  dictated  by  either  limitation  on  hardware  or  urgency  for 
time-efficiency.  With  the  advent  of  cheaper  and  more  massive  memory,  we  do  not 
foresee  memory  overhead  as  a  major  obstacle  for  the  adoption  of  our  method. 

Results  in  this  paper  are  preliminary.  Many  open  problems  remain.  We  defi¬ 
nitely  need  more  experience  in  applying  the  method  to  other  real-life  problems. 
In particular, we should investigate automatic or semi-automatic methods of
creating alternate models from an existing model. Such an exercise is useful even
if we can automate model generation only for problems of a restricted problem
domain.  Another  open  problem  is  the  code  maintenance  of  the  combined  model: 
how  modifications  in  one  model  can  be  “propagated”  to  the  other  models  auto¬ 
matically.  To  solve  these  problems,  we  may  benefit  from  a  formal  model  descrip¬ 
tion  language  or  notation  so  that  models  can  be  described  fully  and  formally. 
Such  a  language  helps  in  extracting  and  formally  reasoning  with  properties  of 
models. 

On the empirical side, it is also interesting to study and experiment with
variable-ordering heuristics that label variables in the sub-models alternately.
Another  important  direction  of  work  is  to  understand  and  analyze  the  propa¬ 
gation  behaviour  of  constraints  among  sub-models  in  a  combined  model.  One 
conjecture  is  that  propagation  among  sub-models  achieves  a  higher  consistency 
level of the combined network.

Abstract. In this paper, we formulate train rescheduling as a constraint
satisfaction problem and describe a constraint propagation approach to
tackle it. Algorithms for timetable verification and train rescheduling are
designed under a coherent framework. We define two optimality criteria
for rescheduling that correspond to minimizing passenger delay and
minimizing the number of station visit modifications respectively. Two
heuristics are then proposed to speed up and direct the search towards the
optimal solutions. The feasibility of our proposed algorithms and heuristics
is confirmed by experimentation using real-life data.

1 Introduction

The PRaCoSy (People’s Republic of China Railway Computing System)
project [10] is undertaken by the International Institute for Software
Technology, United Nations University (UNU/IIST). The aim of the project is to
develop skills in software engineering for automation in the Chinese Railways. A
specific goal of the project is the automation of the preparation and updating
of the running map¹ for dispatching trains along the 600-kilometer railway
line between Zhengzhou and Wuhan in the People’s Republic of China.
The Zhengzhou to Wuhan section has been chosen as a case study because it
lies along the busy Beijing-Guangzhou line, the arterial north-south railway in
China. The rate of running trains, both goods and passenger, on this section is
high, and present management procedures are no longer adequate given the rapid
development of the domestic economy.

¹ A running map is a method of monitoring the movement of trains and rescheduling
their arrivals and departures to satisfy operational constraints.

A running map [9] contains information regarding the topology of the railway,
train numbers and classifications, arrival and departure times of trains at each
station, arrival and departure paths, etc. A computerized running map tool should
read in station and line definitions from a descriptor file, allow segments
(subsets of all stations) and time intervals to be defined, allow train timetables
to be read, and finally display graphically the projection of the timetable against
a given segment and a given interval. A sample running map is shown in
figure 6. Train dispatchers, users of the tool, have to modify the timetable when
trains  in  some  sections  cannot  run  according  to  the  map,  possibly  due  to  acci¬ 
dents  and/or  train  delays.  The  modification  to  the  map  should  be  performed  in 
such  a  way  that  certain  scheduling  rules  (laid  down  by  the  local  railway  bureau) 
are  not  violated.  Therefore,  a  computer  running  map  tool  should  check  users’ 
modifications  against  possible  violation  of  scheduling  rules,  and  warn  users  of 
such  violations.  In  addition,  the  tool  should  also  assist  the  user  in  repairing, 
either  automatically  or  semi-automatically,  an  infeasible  timetable  so  that  the 
least  train  service  disruption  is  made.  We  call  this  process  rescheduling.  Schedul¬ 
ing  and  rescheduling  are  different  in  two  aspects.  First,  while  scheduling  creates 
a  timetable  from  scratch,  rescheduling  assumes  a  feasible  timetable  and  user 
modifications,  which  may  introduce  inconsistencies  to  the  timetable,  as  input. 
Second,  optimality  criteria  used  in  scheduling,  such  as  minimum  operating  cost, 
are  usually  defined  in  the  absolute  sense.  In  rescheduling,  however,  the  quality 
of  the  output  is  measured  with  respect  to  the  original  timetable. 

The  PRaCoSy  project  has  resulted  in  a  running  map  tool  capable  of  train 
timetable  verification  [7].  Our  task  at  hand  is  to  enhance  the  PRaCoSy  tool 
to  perform  automatic  rescheduling,  which  can  be  considered  as  constraint  re¬ 
satisfaction.  A  major  problem  with  the  PRaCoSy  implementation  is  that  con¬ 
straints  are  used  only  passively  to  test  possible  violation  of  scheduling  rules.  In 
view  of  this  limitation,  we  decided  to  re-create  the  running  map  tool  from  scratch 
using  a  constraint  programming  approach.  In  this  paper,  we  give  algorithms  for 
timetable  verification  and  train  rescheduling  used  in  our  tool,  and  show  that  con¬ 
straint  programming  allows  us  to  perform  constraint  checking  and  solving  (or 
propagation)  in  a  coherent  framework.  We  study  two  notions  of  optimality  for 
the  rescheduled  timetable  with  respect  to  the  original  timetable.  These  notions 
provide  a  measure  of  the  quality  of  the  rescheduling  operation.  We  also  present 
two  heuristics  that  direct  and  speed  up  the  search  towards  optimal  rescheduled 
timetables. 

The  rest  of  the  paper  is  organized  as  follows.  Section  2  defines  basic  termi¬ 
nology  and  discusses  related  work.  Section  3  explains  the  timetable  verification 
algorithm.  In  section  4,  we  show  how  to  formulate  train  rescheduling  as  a  con¬ 
straint  satisfaction  problem  and  give  an  associated  algorithm.  We  also  discuss 
two  heuristics  that  help  to  direct  and  speed  up  the  search  towards  optimal  solu¬ 
tions.  In  section  5,  we  describe  our  prototype  implementation  and  sample  runs 
of  the  tool.  We  summarize  our  contribution  and  shed  light  on  future  work  in 
section  6. 

2  Preliminaries 

In  the  following,  we  provide  informal  definitions  of  necessary  terminology  ac¬ 
cording  to  [9]  to  facilitate  subsequent  discussions.  Names  not  defined  should  be 
self-explanatory or clear from the context. Interested readers can refer to [9] for
formal definitions of the same terminology.

The topology of a railway system is defined by a collection of named stations
and identified lines. We differentiate between the lines within a station and those
connecting two stations by referring to the former as lines and the latter as tracks.
A train journey is a sequence of visits to connected stations. Each visit to a
station is represented by the arrival time and the departure time of the train. A
timetable is an association from trains to journeys to be made. Scheduling rules
are a set of temporal constraints that restrict the arrival time and departure time
of each visit in order to prevent such undesirable events as train crashes. A
timetable is valid (or feasible) if no scheduling rules are violated under the
associations. Otherwise, the timetable is invalid (or infeasible).

Given a feasible timetable with n station visits, which is represented by a
set of assignments (or equality constraints) of the form $AT_i = t_i^a$ and
$DT_i = t_i^d$, where $AT_i$ and $DT_i$ denote the arrival and departure time
respectively, for $1 \le i \le n$, the modifications that we can make to the
timetable are to replace some assignments $AT_j = t_j^a$ (or $DT_j = t_j^d$) by
$AT_j = t_j^{a\prime}$ (or $DT_j = t_j^{d\prime}$), where $t_j^a < t_j^{a\prime}$
(or $t_j^d < t_j^{d\prime}$) for $j \in \{1, \ldots, n\}$. In other words, we can
only delay the arrival time or departure time of visits. Given an infeasible
timetable, rescheduling is the process of modifying the timetable so as to make
the timetable feasible.

2.1  Problem  Statement 

There  are  six  types  of  scheduling  rules  [9]  in  our  railway  system:  the  speed  rule, 
the  station  occupancy  rule,  the  station  entry  rule,  the  station  exit  rule,  the  line 
time  rule,  and  the  stopover  rule.  Let  there  be  two  trains  1  and  2  and  two  adjacent 
stations A and B. The variables $AT_{XY}$ and $DT_{XY}$ denote train Y’s arrival
time at and departure time from station X respectively. The above scheduling
rules can be formulated as the following types of scheduling constraints.

The Speed Constraint

$$\left| \frac{\mathit{lng}}{AT_{A1} - DT_{B1}} \right| \le \mathit{sp}$$

The constant lng denotes the distance between station A and station B. This
constraint enforces that the average train speed traveling between the two
stations cannot exceed sp.

The Station Occupancy Constraint

$$(DT_{A2} + \mathit{ctr} \le AT_{A1}) \lor (DT_{A1} + \mathit{ctr} \le AT_{A2})$$

This constraint enforces that there are at least ctr time units between two trains
occupying a track.

The Station Entry Constraint

$$(AT_{A1} - AT_{A2} \ge \mathit{cen}) \lor (AT_{A2} - AT_{A1} \ge \mathit{cen})$$

This constraint enforces that there are at least cen time units between two trains
entering a station via a line.

The Station Exit Constraint

$$(DT_{A1} - DT_{A2} \ge \mathit{cex}) \lor (DT_{A2} - DT_{A1} \ge \mathit{cex})$$

This constraint enforces that there are at least cex time units between two trains
departing from a station via a line.

The Line Time Constraints

$$((DT_{B1} < DT_{B2}) \land (AT_{A1} < AT_{A2})) \lor ((DT_{B1} > DT_{B2}) \land (AT_{A1} > AT_{A2}))$$

$$(AT_{B1} \le DT_{B2} - \mathit{cenx}) \lor (AT_{A2} \le DT_{A1} - \mathit{cenx})$$

The  line  time  rule  is  split  into  two  constraints.  The  first  constraint  enforces  that 
no  train  overtakes  another  train  if  they  are  traveling  in  the  same  direction  on  a 
line.  The  second  constraint  enforces  that  if  there  are  two  journeys  on  a  line  in 
opposing  directions,  the  line  must  be  unoccupied  for  at  least  cenx  time  units. 

The  Stopover  Constraint 


$$DT_{A1} - AT_{A1} \ge \mathit{cst}$$

This  constraint  enforces  that  a  train  will  stay  in  a  station  for  at  least  cst  time 
units. 
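
As an illustration, the six rules can be written as plain Boolean checks over ground times. The following is a minimal sketch and not the tool's implementation: the `Visit` class and all function names are hypothetical, times are assumed to be integer minutes, and the inequalities are read as non-strict wherever the text says "at least".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One train's visit to one station; times are minutes after midnight."""
    at: int  # arrival time
    dt: int  # departure time


def speed_ok(dt_b1: int, at_a1: int, lng: float, sp: float) -> bool:
    """Speed rule: average speed over a track of length lng must not exceed sp."""
    travel = abs(at_a1 - dt_b1)
    return travel > 0 and lng / travel <= sp


def occupancy_ok(a1: Visit, a2: Visit, ctr: int) -> bool:
    """Station occupancy rule: one train leaves at least ctr before the other arrives."""
    return a2.dt + ctr <= a1.at or a1.dt + ctr <= a2.at


def entry_ok(a1: Visit, a2: Visit, cen: int) -> bool:
    """Station entry rule: arrivals are at least cen apart."""
    return a1.at - a2.at >= cen or a2.at - a1.at >= cen


def exit_ok(a1: Visit, a2: Visit, cex: int) -> bool:
    """Station exit rule: departures are at least cex apart."""
    return a1.dt - a2.dt >= cex or a2.dt - a1.dt >= cex


def no_overtake_ok(b1: Visit, b2: Visit, a1: Visit, a2: Visit) -> bool:
    """Line time rule, part 1: same-direction trains keep their order from B to A."""
    return (b1.dt < b2.dt and a1.at < a2.at) or (b1.dt > b2.dt and a1.at > a2.at)


def opposing_ok(a1: Visit, b1: Visit, a2: Visit, b2: Visit, cenx: int) -> bool:
    """Line time rule, part 2: opposing journeys leave the line free for cenx."""
    return b1.at <= b2.dt - cenx or a2.at <= a1.dt - cenx


def stopover_ok(a1: Visit, cst: int) -> bool:
    """Stopover rule: a train stays in the station for at least cst."""
    return a1.dt - a1.at >= cst
```

In a propagation-based solver these relations would be posted over domain variables rather than evaluated on fixed values; the sketch only fixes the meaning of each rule.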

Given  the  topology  of  a  railway  system  with  a  valid  train  timetable,  due  to 
unexpected  events,  the  users  of  the  running  map  tool  may  want  to  modify  the 
timetable.  Our  work  is  to  first  check  the  feasibility  of  the  modified  timetable. 
If  it  is  feasible,  the  previous  timetable  is  replaced  by  the  modified  one.  Other¬ 
wise,  we  reschedule  the  infeasible  timetable  to  generate  a  new  feasible  timetable. 
Note  that  efficiency  should  be  a  critical  concern  in  designing  the  verification 
and  rescheduling  algorithms  since  in  real-life  situations,  rescheduling  must  be 
performed  in  a  timely  manner.  The  notion  of  “efficiency”  may  vary  according  to 
situations.  Ten  minutes,  however,  should  be  a  tolerable  bound  in  general  [2]. 

Optimal solutions are usually not required. In most cases, it is impractical
to  generate  optimal  solutions  within  a  given  (usually  small)  time  bound.  Cri¬ 
teria  for  optimality,  however,  should  be  defined.  Such  definitions  can  serve  as 
guidelines  for  designing  various  variable-ordering  and  value-ordering  heuristics 
to  generate  “good”  answers.  A  precise  notion  of  optimality  also  enables  us  to 
measure  the  “quality”  of  the  rescheduled  timetable.  In  the  following,  we  present 
two  optimality  criteria. 

A rescheduled timetable is minimum-changes optimal with respect to the
original timetable if the least number of station visits are modified. This criterion
can be satisfied easily in general since in most cases, we can simply delay the
trains in question to the latest possible time. The resulting timetable, however,
may introduce unreasonably long delays to some train visits. Thus this criterion
should usually be applied with other criteria limiting the maximum delay.

A rescheduled timetable is minimum-delay optimal with respect to the
original timetable if the longest delay among all train visits is minimum. Let the
tuples $(AT_1, DT_1, \ldots, AT_n, DT_n)$ and $(AT'_1, DT'_1, \ldots, AT'_n, DT'_n)$
denote the infeasible and the rescheduled timetable respectively. The goal of this
criterion is to minimize the following expression:

$$\max(AT'_1 - AT_1,\ DT'_1 - DT_1,\ \ldots,\ AT'_n - AT_n,\ DT'_n - DT_n).$$

The  aims  of  the  two  criteria  could  contradict  one  another  and  represent  the 
extremes  of  a  spectrum  of  other  possible  definitions  of  optimality. 
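
Both criteria are simple functions of a pair of timetables. The sketch below assumes each timetable is a flat tuple (AT_1, DT_1, ..., AT_n, DT_n) of integer minutes; the function names are illustrative only.

```python
def max_delay(before, after):
    """Minimum-delay measure: the longest delay over all arrival/departure times."""
    return max(new - old for old, new in zip(before, after))


def changed_visits(before, after):
    """Minimum-changes measure: number of station visits whose AT or DT differs."""
    count = 0
    for i in range(0, len(before), 2):  # each visit occupies two consecutive slots
        if before[i:i + 2] != after[i:i + 2]:
            count += 1
    return count
```

A rescheduler aiming at one criterion can still report the other, which makes the trade-off between the two visible to the dispatcher.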

2.2  Related  Work 

Rescheduling  is  different  from  traditional  scheduling  in  the  sense  that  the  pos¬ 
sible  solutions  of  rescheduling  are  restricted  by  the  original  schedule.  Zweben  et 
al  [11]  tackle  this  problem  using  constraint-based  iterative  repair  with  heuristics. 
The  resultant  GERRY  scheduling  and  rescheduling  system  is  applied  to  coor¬ 
dinate  Space  Shuttle  Ground  Processing.  Our  work  is  based  on  a  propagation- 
based  constraint  solver. 

Somewhat related to our work is train scheduling. Komaya and Fukuda [5]
propose a problem solving architecture for knowledge-based integration of
simulation and scheduling. Two train scheduling systems are designed in this
architecture. Fukumori et al [3] use tree search and constraint propagation
together with the concept of a time belt in their scheduling system. This
approach is claimed to be suitable for double-track lines and continuous time
units. Recently, Chiang and Hau [1] attempt to combine a repair heuristic with
several search methods to tackle scheduling problems for general railway systems.

There  are  two  on-going  projects  that  aim  at  automating  train  scheduling  for 
real-life  railway  ministries.  Our  work  is  a  direct  outgrowth  of  the  PRaCoSy 
project  [10]  at  UNU/IIST.  The  latest  PRaCoSy  running  map  tool  prototype 
uses  constraints  only  passively  to  test  for  constraint  violation  in  their  verification 
engine. The Train Scheduling System (TSS) designed for the Taiwan Railway Bureau
(TRB) [6] is a knowledge-based interactive train scheduling system incorporating
both an automatic and a manual scheduler. Users and the computer system are
thus able to bring complementary skills to the scheduling tasks.

3  Timetable  Verification 

In  the  following,  we  describe  a  timetable  verification  algorithm,  which  examines  if 
a  given  timetable  is  valid  with  respect  to  a  set  of  scheduling  constraints.  Violated 
scheduling  rules  (or  constraints)  in  an  invalid  timetable  will  be  located  and 
displayed to the user. The algorithm, shown in figure 1, assumes the existence
of a propagation-based constraint solver [8], propagate().

The timetable T can be viewed as a set of constraints, one for each variable, of
the form $AT_i = t_i^a$ or $DT_i = t_i^d$, where $AT_i$ is an arrival time and
$DT_i$ is a departure time. After all variables become ground (line 5), we post the
scheduling constraints one by one (lines 7-13). Having all variables ground, the
propagate() engine essentially performs constraint checking. It is easy to check
that a constraint is violated under the given timetable if inconsistency is found
after the constraint is told to the store. Violated constraints are retracted so
that the algorithm can proceed to check for other possible constraint violations.
Again, the groundness of all variables allows us to retract a constraint by simply
removing the undesirable constraint from the constraint store.

 1  procedure verify(in C, T, out I)
 2  /* C: scheduling constraints, T: timetable, I: violated constraints */
 3  /* Initialization */
 4  I ← {}
 5  S ← propagate(T)            /* Constraint store S is initialized to T */
 6  /* Constraint Verification */
 7  for each c ∈ C
 8      S ← propagate(S ∪ {c})
 9      if inconsistency found
10          I ← I ∪ {c}
11          S ← S \ {c}         /* Retract c from the constraint network */
12      endif
13  endfor
14  end

Fig. 1. The Timetable Verification Algorithm
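
Because every variable is ground, the verification loop reduces to evaluating each constraint on the timetable. The sketch below makes that explicit. It is an assumption-laden simplification: constraints are represented as hypothetical (name, predicate) pairs over a dictionary of times, and no constraint store or propagate() engine is modeled.

```python
def verify(constraints, timetable):
    """Return the names of all constraints violated by a ground timetable.

    constraints: iterable of (name, predicate) pairs; each predicate takes
    the timetable dictionary and returns True when the rule is satisfied.
    """
    violated = []
    for name, holds in constraints:
        if not holds(timetable):
            # Record the violation and keep going, which mirrors retracting
            # the offending constraint before checking the remaining ones.
            violated.append(name)
    return violated
```

With a timetable such as `{"AT_A1": 60, "DT_A1": 61}` and a stopover rule requiring two minutes, the function reports that rule and still checks every other constraint.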


4  Rescheduling  as  Constraint  Satisfaction 

Scheduling is an instance of the constraint satisfaction problem. In the following,
we show the same for rescheduling. Given a timetable T, users modify T by
adjusting its arrival and departure times, obtaining T', which can be valid or
invalid. If T' is invalid, the rescheduling process should attempt to repair T' to
make it feasible. By repairing, we mean adjusting the values of the non-modified
variables so that (1) the timetable becomes valid again and (2) the new timetable
is reasonably “close” to the original timetable T. By being close to T, we
mean that the new timetable should create the least service disruption. Example
optimality criteria are given in section 2. Note that user-modified variables must
be kept fixed during the rescheduling process since the modifications represent
dispatcher requirements.

In  order  to  formulate  rescheduling  as  constraint  satisfaction,  we  have  to  de¬ 
termine  the  variables  of  the  problem,  the  domains  associated  with  the  variables, 
and the constraints of the problem. In rescheduling, the variables are the arrival
and departure times of the timetable. Every variable shares the integer domain
{0, ..., 1439}². There are three types of constraints in the rescheduling problem.

1. Scheduling constraints: The scheduling constraints set forth in section 2.

2. Modification constraints: For each arrival or departure time X which is
modified by the user to a new value t, we have the equality constraint X = t. This
constraint enforces the user modifications to stay fixed during rescheduling.

3. Forward-labeling constraints: For each non-modified variable X with value
t in the original timetable T, we have the constraint X ≥ t. This constraint
is necessary to ensure that we can only delay arrival or departure times.

² There are 1440 minutes in 24 hours.

Rescheduling  now  becomes  finding  a  solution  to  the  above  constraint  satisfac¬ 
tion  problem.  A  solution  is  optimal  if  the  solution  is  “closest”  to  the  original 
timetable. 
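
The modification and forward-labeling constraints are unary, so they can be folded directly into the initial variable domains. The following sketch shows one way to do this; the dictionary-based representation and the function name are assumptions made for illustration.

```python
MINUTES_PER_DAY = 1440  # domain {0, ..., 1439}


def initial_domains(original, modifications):
    """Build one domain per variable from the two unary constraint types.

    original:      {variable: time in the original timetable T}
    modifications: {variable: new time fixed by the user}
    """
    domains = {}
    for var, t in original.items():
        if var in modifications:
            fixed = modifications[var]
            domains[var] = range(fixed, fixed + 1)      # X = t (modification)
        else:
            domains[var] = range(t, MINUTES_PER_DAY)    # X >= t (forward labeling)
    return domains
```

The scheduling constraints are the only non-unary ones, which is why they can be propagated once and reused across rescheduling requests.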

We are now ready to present the rescheduling algorithm, shown in figures 2
and 3. The algorithm can be divided into three phases. In phase one (line 4), we
post and propagate all scheduling constraints to prune infeasible values in the
variables. The pruned constraint network is then saved in S0. Since the
scheduling constraints are the same for any timetable T and user modifications U,
the same store S0 will be reused in every rescheduling step. In real-life
situations, rescheduling has to be performed repeatedly for different timetables
and user modifications, so this saving operation helps to avoid unnecessary
invocations of constraint relaxation.

 1  procedure reschedule(in C, T)
 2  /* C: scheduling constraints, T: feasible timetable */
 3  /* Initialization */
 4  S0 ← propagate(C)           /* Save a copy of the constraint store after
 5                                 propagating constraints in C */
 6  /* Rescheduling */
 7  while true
 8      Read in user modifications U    /* U is in the same form as T */
 9      modify(T, S0, U, R)
10      if R = fail             /* No feasible timetable */
11          Prompt error messages
12      else
13          T ← R               /* Display rescheduled timetable if necessary */
14      endif
15  endwhile

Fig. 2. The Train Rescheduling Algorithm

Actual rescheduling takes place in the procedure modify() (lines 16-46). In
the second phase (lines 19-35) of rescheduling, information is extracted from user
modifications and the original timetable to post and propagate the modification
constraints (lines 23-27) and the forward-labeling constraints (lines 28-35). If
inconsistency is found, rescheduling is halted and failure is reported. In the third
phase (lines 36-46), variables that are not modified by the users (extracted by
the vars() function) are enumerated or labeled using some form of variable-
and value-ordering heuristics, which are embedded in the labeling() function,
to speed up and direct the search towards a near-optimal solution.

16  procedure modify(in T, S0, U, out R)
17  /* T: previous feasible timetable, S0: saved constraint network state,
18     U: user modifications, R: rescheduled timetable */
19  /* Initialization */
20  O ← {X = t | (X = t) ∈ T, X ∈ vars(U)}
21  /* Assignment constraints associating with user-modified variables */
22  /* Post all user modifications */
23  S ← propagate(S0 ∪ U)       /* First and second type of constraints */
24  if inconsistency found      /* User modifications are inconsistent */
25      R ← fail
26      return
27  endif
28  /* Prune infeasible values: can only delay arrival & departure time */
29  for constraint (X = t) ∈ T \ O
30      S ← propagate(S ∪ (X ≥ t))      /* Third type of constraints */
31      if inconsistency found
32          R ← fail
33          return
34      endif
35  endfor
36  /* Rescheduling */
37  E ← vars(T \ O)             /* Set of variables for rescheduling */
38  A ← labeling(E)
39  /* Appropriate variable- and value-ordering should be used.
40     The function labeling() returns either {} or a set of equality
41     constraints (or bindings) for each variable X ∈ E */
42  if A = {}
43      R ← fail
44  else
45      R ← U ∪ A
46  endif

Fig. 3. The Train Rescheduling Algorithm (cont.)

There are two situations in which user modifications lead to a timetable T'
that is non-repairable. First, the user modifications are self-conflicting. Since
user-modified variables must be kept fixed during rescheduling, it is impossible
to repair other variables to make the timetable valid. Second, the user
modifications are not self-conflicting but there is no room for other variables to
adjust to make the timetable valid. Constraint propagation algorithms are well
known to be incomplete [8]. Thus, phase two of the rescheduling algorithm can
detect some, but not all, conflicts of this kind. Theoretically speaking, the
enumeration procedure in phase three is guaranteed to detect the inconsistency,
but it would usually take impractically long to do so. In cases when the
rescheduling algorithm fails to return an answer within a few minutes, users are
advised to abort the current computation, re-adjust the modifications, and restart
the rescheduling process.
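
To make the three phases concrete, the sketch below reschedules a tiny timetable. It deliberately swaps the propagation-based solver for plain chronological backtracking that checks a constraint only once all of its variables are labeled, so it illustrates the problem formulation rather than the propagate() engine. The function name, the (scope, predicate) constraint format, and the `horizon` parameter are all assumptions.

```python
def reschedule(original, modifications, constraints, horizon=1440):
    """Return a feasible timetable as a dict, or None if none is found.

    original:      {variable: original time}
    modifications: {variable: time fixed by the user}
    constraints:   list of (scope, predicate); scope is a tuple of variable
                   names and predicate takes their values in that order.
    """
    # Modification (X = t) and forward-labeling (X >= t) constraints
    # are folded into the initial domains.
    domains = {
        var: [modifications[var]] if var in modifications else range(t, horizon)
        for var, t in original.items()
    }
    order = list(domains)  # labeling order; a heuristic would reorder this list

    def consistent(assignment):
        # Check only the constraints whose variables are all labeled.
        for scope, holds in constraints:
            if all(v in assignment for v in scope):
                if not holds(*(assignment[v] for v in scope)):
                    return False
        return True

    def label(i, assignment):
        if i == len(order):
            return dict(assignment)
        var = order[i]
        for value in domains[var]:  # values are tried in ascending order
            assignment[var] = value
            if consistent(assignment):
                solution = label(i + 1, assignment)
                if solution is not None:
                    return solution
        assignment.pop(var, None)  # undo before backtracking
        return None

    return label(0, {})
```

Self-conflicting user modifications surface here as an immediate `None`, whereas the second kind of non-repairable timetable is only discovered after the search space has been exhausted, which matches the cost concern raised above.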

In  the  rest  of  this  section,  we  present  two  variable  labeling  heuristics,  which 
are  designed  to  yield  rescheduled  timetables  in  the  minimum-delay  and  the 
minimum-change  optimal  sense  respectively. 

Smallest-First Principle. Variables are ordered in ascending order of the
lower bound of their domains. Values in the variable domains are also
enumerated in ascending order. This principle is founded on the assumption that a
short delay on a train visit will cause a short delay on the subsequent one, which
means that delays propagate in a monotonic fashion. Experimental results using
actual timetables from PRaCoSy confirm that this heuristic usually helps
to generate minimum-delay optimal solutions efficiently. We construct
below an unrealistic artificial example that defeats the heuristic.
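
The ordering itself is a one-line sort. In the sketch below, `domains` is assumed to map each variable name to its current (already pruned) domain; breaking ties by variable name is an addition made only to keep the order deterministic.

```python
def smallest_first_order(domains):
    """Order variables by ascending lower bound of their domains."""
    return sorted(domains, key=lambda var: (min(domains[var]), var))


def ascending_values(domain):
    """Enumerate the values of one domain from earliest to latest."""
    return sorted(domain)
```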

Figure 4 (a) shows a small segment of four journeys on a railway running
map. The three journeys A, B, and C share the same track in the Nanjing
station, while the two journeys C and D take the same line in traveling from
Nanjing to Longtan. The journeys A and D are fixed (indicated by thick lines)
by the users. Suppose the station occupancy and the station exit rules enforce
a separation of at least ten minutes between any two of the three points A, B,
and C, and of at least sixty minutes between the points C and D, respectively.
The modified timetable is thus infeasible due to the insufficient separation
between the points A and B.

[Figure 4: two running-map panels for the Nanjing-Longtan segment, time axis
from 1:00 to 4:00. (a) Before Rescheduling; (b) After Rescheduling]

Fig. 4. A Non-Optimal Solution by Smallest-First Heuristic

Figure  4  (b)  shows  the  rescheduled  timetable  obtained  using  the  smallest- 
first  principle.  The  rescheduling  starts  from  moving  point  B  ahead  in  time  to 
achieve  the  ten-minute  requirement  between  points  A  and  B.  The  movement  in 
turn  causes  another  conflict  between  points  B  and  C.  Point  C  is  thus  forced  to 
move.  However,  there  is  no  feasible  location  for  point  C  to  move  between  points 
B and D since point D is fixed by the user. Therefore, we have to move point C
one hour ahead of point D. The maximum delay in this case is two hours. This
solution is not optimal since a better solution can be obtained by simply moving
point B twenty minutes ahead in time, as shown in figure 5.


Fig. 5. A Smallest-Changes-Optimal Solution


Consistent-Assignment-First Principle. This heuristic suggests instantiating
first those variables that can be instantiated with their time values in the
original timetable. We call these variables non-conflicting. The other variables
are conflicting. As a result, the labeling of the non-conflicting variables will be
backtracked into last. Again, values in the variable domains are enumerated in
ascending order. The idea is to keep as many variables at their original values
as possible. This heuristic directs the search towards a minimum-change optimal
solution. Note that a non-conflicting variable may eventually be instantiated
with a value other than its original value if the conflicting variables have tried
all possible combinations of time values and no solution is found.

For efficiency reasons, we further classify the non-conflicting variables into
two groups. The first group contains variables that share journeys with one of
the conflicting variables. The second group contains the rest of the
non-conflicting variables. Our heuristic labels the second group of
non-conflicting variables first. This ordering is essential, since the labeling
of variables sharing journeys with conflicting variables has a higher chance of
being backtracked into.
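
The three-way labeling order described above can be sketched as follows. This
is only an illustration under stated assumptions, not the prototype's code: the
`Visit` record, its fields, and the function `labeling_order` are hypothetical
names, and "sharing a journey" is taken to mean that two variables belong to
the same journey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """A timetable variable (hypothetical representation for illustration)."""
    name: str
    journey: str
    conflicting: bool  # True if the original time value is no longer feasible


def labeling_order(visits):
    """Return the visits in the order in which they would be labeled."""
    # Journeys that contain at least one conflicting variable.
    conflict_journeys = {v.journey for v in visits if v.conflicting}

    def group(v):
        if v.conflicting:
            return 2  # conflicting variables are labeled last
        if v.journey in conflict_journeys:
            return 1  # non-conflicting, but shares a journey with a conflict
        return 0      # remaining non-conflicting variables are labeled first

    # sorted() is stable, so the input order is kept within each group.
    return sorted(visits, key=group)
```

Variables labeled later are the first to be revisited on backtracking, so this
order keeps the variables least likely to change furthest from the backtrack
point.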

We apply this heuristic to reschedule the infeasible timetable in figure 4 (a).
Recall that the points A and D are fixed by users. We classify the variables
associated with point B and point C as conflicting and non-conflicting
respectively. Thus point C is labeled first to retain its original position in
the map, and point B is forced to move until it reaches the location ten
minutes ahead of point C. In this specific case, the minimum-change optimal
solution coincides with the minimum-delay optimal solution. This example also
shows that, in general, the two heuristics give different first solutions.

5  Prototype  Implementation 

In order to demonstrate the feasibility of our algorithms, we have re-constructed
and enhanced the PRaCoSy running map tool prototype [7] with rescheduling
capability. The prototype consists of a constraint-based scheduler and a user
interface. The former is implemented in C++ with the ILOG Solver library 2.0 [4],
while the latter is built using Microsoft Visual Basic 3.0.

In the following, an overview of the running map tool is presented. We then
give a sample session of our tool using a segment (from Nanjingxi to Shanghai)
of the China railway which amounts to 3005 constraints and 596 variables. The
rescheduled timetables generated by the two heuristics are explained. We
conclude this section by showing two examples to which our tool fails to
respond in a timely manner.

The running map display (figure 6) consists of six columns (regions). Column
one and column four show the abbreviated identifiers of stations. Column two
shows the number of arrival and departure lines of stations. Column three is the
main window for presenting the graphical representation of a timetable, with
time and locations as the X and Y axes respectively. Column five shows the
cumulative distance from the first station. Column six shows the distance between
the current station and the previous station. In the cases where a timetable is
too large to fit into the main window, two scrollbars will be enabled.


Fig.  6.  The  Running  Map  Tool 


A  user  can  click-and-drag  any  lines  to  modify  the  corresponding  train  visits 
on  the  map.  The  modified  timetable  will  be  validated  using  the  verification 
algorithm  when  the  “Check”  button  is  pressed.  If  it  is  infeasible,  a  warning 
window,  such  as  that  shown  in  figure  7,  will  pop  up  to  display  all  constraint 
violations.  At  this  point,  the  user  can  either  invoke  the  rescheduling  algorithm 
by  pressing  the  “Reschedule”  button,  correct  the  modifications  manually,  or 
restore  the  original  feasible  timetable. 

Figure 8 (a) shows a segment of a China railway timetable. Due to an accident,
we have to delay the train departing from Zhenjiang at 1:05 to 3:30, yielding
the map shown in figure 8 (b). The user-modified departure time (pointed to
by an arrow) and all the subsequent train visits on the journey (the highlighted
segment) are then fixed immediately by the running map tool. This user movement
incurs seven constraint violations between the modified journey and its left
adjacent journey. We reschedule the infeasible timetable with our two heuristics.
Both of them succeed in generating a feasible timetable within a few seconds.
Figure 9 (a) and figure 9 (b) show the rescheduled timetables generated using the
smallest-first principle and the consistent-assignment-first principle respectively.


(a) Original Feasible Timetable (b) Modified Infeasible Timetable


Fig. 8. A Comparison of Two Heuristics


(a) Smallest-First Principle (b) Consistent-Assignment-First Principle


Fig. 9. A Comparison of Two Heuristics (cont.)


Applying  the  smallest-first  principle,  we  process  all  visits  on  the  map  from 
the  left  (earliest)  to  the  right  (latest).  For  each  visit,  if  its  associated  arrival 
(or  departure)  time  does  not  violate  any  scheduling  constraints,  we  preserve 
the  current  value.  Otherwise,  we  move  the  time  ahead  as  little  as  possible  to 
eliminate  the  inconsistencies.  Thus  some  visits  on  a  journey  may  be  modified 
while  others  remain  unchanged.  This  explains  why,  visually,  a  journey  is  not  only 
shifted  right  horizontally,  but  can  also  be  “bent”  by  the  rescheduling  process. 
The  movement  propagates  in  the  above  fashion  from  left  to  right.  Whenever  no 
further  movements  are  possible,  backtracking  takes  place. 

Instead of massaging several journeys to produce a feasible timetable, the
consistent-assignment-first principle aims to modify as few station visits as
possible. This goal can be well approximated by first locating station visits that
can remain unchanged with respect to the user modifications. These station visits
will be labeled first. The conflicting variables, which have to change their
original values, will be labeled last. Our experiments reveal that, in many cases,
this heuristic produces a timetable which is “almost identical” to the original
timetable. As seen in figure 9 (b), most of the journeys retain their original
locations. Even for the right-shifted journey, its shape is mostly preserved.

Experimental results confirm that rescheduling can usually be completed
within seconds. This is not always the case, however; figure 10 (a) and
figure 10 (b) provide two counterexamples. The infeasible timetable in
figure 10 (a) can be rescheduled using the consistent-assignment-first principle
in a few seconds, but the smallest-first principle fails to return an answer
within five minutes. Figure 10 (b) is simply a non-repairable timetable. Neither
heuristic can return promptly to confirm the unsolvability of the problem.


(a) Infeasible Timetable 1 (b) Infeasible Timetable 2


Fig.  10.  Poor  Performance  Example 


6  Concluding  Remarks 

The  contribution  of  this  paper  is  three-fold.  First,  we  define  formally  train 
rescheduling  as  a  constraint  satisfaction  problem.  Two  algorithms  for  railway 
timetable  verifications  and  rescheduling  are  then  derived  based  on  a  propagation- 
based  constraint  solver.  We  define  two  optimality  criteria,  which  are  used  to 
measure  the  “quality”  of  the  rescheduled  timetable.  It  is  important  to  note  that 
the optimality criteria are defined with respect to the original timetable.
Second, based on the domain knowledge learned from domain analysis, we propose
two heuristics to speed up and direct the search towards minimum-delay optimal
and minimum-change optimal solutions respectively. The feasibility of our
proposed algorithms and heuristics is confirmed by experimentation using
real-life data. Third, we have re-constructed and enhanced the PRaCoSy running
map tool prototype.

An interesting direction for future work is to study different stochastic methods
for train rescheduling. Work is also in progress to evaluate our rescheduling
method on larger-scale real-life railway timetables.


Abstract. There has been considerable research interest in the solubility
phase transition and its effect on search cost for backtracking
algorithms. In this paper we show that a similar easy-hard-easy pattern
occurs for local search, with search cost peaking at the phase transition.
This is despite problems beyond the phase transition having fewer
solutions, which intuitively should make the problems harder to solve. We
examine the relationship between search cost and number of solutions
at different points across the phase transition, for three different local
search procedures, across two problem classes (CSP and SAT). Our
findings show that there is a significant correlation, which changes as we
move through the phase transition.


Keywords:  computational  complexity,  constraint  satisfaction, 
propositional  satisfiability,  search 


1  Introduction 

Local search has been proposed as a good candidate for solving the “hard” but
soluble problems that turn up at the phase transition in solubility for
satisfiability and constraint satisfaction problems. The position of such a phase
transition appears to be strongly determined by the expected number of solutions
[21, 19, 7, 8]. Recent theoretical analysis has shown that large variances in the
number of solutions can occur at the phase transition [11]. In addition, empirical
analysis has shown phase transitions can occur when the expected number of
solutions is significantly larger than 1 [19]. These results may be important for
understanding performance of local search procedures as they might be expected
to be strongly influenced by the number of solutions. If there are many solutions,
local search may stumble on one easily. On the other hand, local search may also
be led in conflicting directions by different solutions.

In this paper we show that, across a range of problem classes and local search
procedures, the hardest problems occur (as for complete systematic algorithms)
at the phase transition in solubility. This is despite the fact that problems
beyond the phase transition can have fewer solutions. Problem difficulty across the
phase transition is, however, affected by the number of solutions. We identify a
correlation between number of solutions and problem hardness for local search.
We show this correlation is robust across problem class and types of local search
procedure, and across the phase transition. The number of solutions is not the
only factor, since we identify significant variation in problem hardness when
this is held constant. These results are likely to be of considerable importance
for understanding phase transition behaviour in local search procedures and for
benchmarking such procedures.

2  Background 

2.1  SAT 

Propositional satisfiability (or SAT) is the problem of deciding if there is an
assignment of truth values for the variables in a propositional formula that makes
the formula true using the standard interpretation for logical connectives. We
will consider SAT problems in conjunctive normal form (CNF); a formula S
in CNF is a conjunction of clauses, where a clause is a disjunction of literals,
and a literal is a negated or un-negated variable. In k-SAT problems, all clauses
contain exactly k literals. Both SAT and k-SAT (for k ≥ 3) are NP-complete
[6]. As is usual [14], we will generate random k-SAT problems with n variables
and l clauses, by picking k variables out of the n possible for each clause, and
then negating each variable with probability 1/2.
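
The generation model just described can be sketched in a few lines. This is a
minimal illustration, not the generator used for the experiments; the function
name `random_ksat`, the signed-integer clause encoding, and the optional `rng`
argument are assumptions made for the example.

```python
import random


def random_ksat(n, l, k, rng=None):
    """Generate a random k-SAT formula with n variables and l clauses.

    Variables are numbered 1..n; a literal is +v or -v (assumed encoding).
    """
    rng = rng or random.Random()
    clauses = []
    for _ in range(l):
        # Pick k distinct variables out of the n possible ...
        variables = rng.sample(range(1, n + 1), k)
        # ... and negate each one independently with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses
```

For instance, `random_ksat(100, 430, 3)` produces a 100-variable 3-SAT instance
with l/n = 4.3.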

2.2  CSP 

A constraint satisfaction problem (CSP) consists of a set of n variables and a
set of constraints. Each variable v has a domain M_v of size m_v. Each k-ary
constraint restricts a k-tuple of variables (v_1, ..., v_k) and specifies a subset
of M_1 × ... × M_k, each element of which is a combination of values that these
variables cannot simultaneously take. We consider here binary CSPs, in which
constraints are only between pairs of variables. As in previous studies [19, 7],
we will generate binary CSPs with n variables, each with domain size m,
constraint density p1, and constraint tightness p2, by picking exactly
p1 n(n − 1)/2 out of the n(n − 1)/2 possible binary constraints between
variables. For each selected constraint, we disallow exactly p2 m^2 of the
possible pairs of values of the two variables.
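
A sketch of this generation model follows. It is an illustration only: the
name `random_binary_csp` and the dictionary-of-nogoods representation are
assumptions, and rounding is used when p1 n(n − 1)/2 or p2 m^2 is not an
integer, a case the text does not discuss.

```python
import itertools
import random


def random_binary_csp(n, m, p1, p2, rng=None):
    """Generate a random binary CSP with parameters (n, m, p1, p2).

    Returns a dict mapping a variable pair (i, j), i < j, to the set of
    disallowed value pairs (a, b) for that constraint.
    """
    rng = rng or random.Random()
    all_edges = list(itertools.combinations(range(n), 2))  # n(n-1)/2 pairs
    all_pairs = list(itertools.product(range(m), repeat=2))  # m^2 value pairs
    n_constraints = round(p1 * len(all_edges))  # assumption: round to integer
    n_nogoods = round(p2 * m * m)               # assumption: round to integer
    csp = {}
    for edge in rng.sample(all_edges, n_constraints):
        csp[edge] = set(rng.sample(all_pairs, n_nogoods))
    return csp
```

With n = 20, m = 10, p1 = 0.5 and p2 = 0.32 this selects 95 of the 190 possible
constraints and disallows 32 of the 100 value pairs in each.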

2.3  Phase  Transitions 

Many NP-complete problems, like satisfiability and constraint satisfaction,
display a rapid transition in solubility as we increase the constrainedness of
random problem instances. This phase transition is associated with problems which
are typically hard to solve for backtracking procedures [2]. Problems that are
under-constrained tend to have many solutions. It is usually therefore very easy
to guess one of the solutions. Problems that are over-constrained tend not to have
any solutions. As there are many constraints, any possible solution is usually
quickly ruled out. At an intermediate point, problems are critically constrained;
out of a random sample some will be soluble and some not, and it is usually
hard to either find a solution or to prove that none exists. Many investigations
have studied phase transition behaviour in backtracking algorithms, in problems
such as SAT (e.g. [14, 3]), CSP [7, 16, 19], Hamiltonian circuits [2, 5], and the
traveling salesman problem (e.g. [8]).

A uniform treatment of phase transitions in combinatorial problems has
recently been presented in [8], formalising the notion of ‘constrainedness’. Given
an ensemble of problems, the constrainedness is defined by

$$\kappa \;=_{\text{def}}\; 1 - \frac{\log_2(\langle Sol \rangle)}{\mathcal{N}} \qquad (1)$$

where ⟨Sol⟩ is the expected number of solutions for a problem in the ensemble,
and N is the number of bits needed to write down a solution, i.e. the base 2
logarithm of the size of the state space. κ lies in the range [0, ∞). If κ ≪ 1
problems are under-constrained and likely to be soluble, if κ ≫ 1 problems are
over-constrained and likely to be insoluble, and if κ ≈ 1 problems are critically
constrained and may be soluble or insoluble.

As in [8], we plot most of our results against the constrainedness κ of problem
instances. This allows phase transitions in different classes such as SAT and CSP
to be directly compared. Furthermore, such comparisons are directly related to
the number of solutions, since

$$\log_2(\langle Sol \rangle) = \mathcal{N}(1 - \kappa) \qquad (2)$$

For random k-SAT problems, N = n and κ = −log2(1 − 2^(−k)) l/n [8]. This
is a constant multiplied by the familiar parameter l/n [14]. For the familiar
case of k = 3, i.e. 3-SAT, the multiplier is log2(8/7) = 0.192... Note that the
prediction of a phase transition at κ = 1 is equivalent to l/n = 5.19... The
fact that the actual phase transition is observed at l/n ≈ 4.3, i.e. κ ≈ 0.83,
indicates that in 3-SAT the expected number of solutions at the phase
transition grows as approximately 2^(0.17n). For random CSPs, N = n log2(m) and

$$\kappa = \frac{n-1}{2}\, p_1 \log_m\!\left(\frac{1}{1 - p_2}\right)$$
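
The arithmetic in this subsection can be checked numerically. The sketch below
is illustrative only; the function names are hypothetical. The k-SAT expression
is the one quoted from [8]; the CSP expression is derived here from definition
(1) under the generation model of Section 2.2 (expected number of solutions
m^n (1 − p2)^(p1 n(n − 1)/2)) and should be read as an assumption rather than a
quotation from the source.

```python
import math


def kappa_ksat(n, l, k=3):
    """Constrainedness of random k-SAT with n variables and l clauses."""
    return -math.log2(1 - 2 ** -k) * l / n


def kappa_csp(n, m, p1, p2):
    """Constrainedness of a random binary CSP (derived from definition (1))."""
    return (n - 1) / 2 * p1 * math.log(1 / (1 - p2), m)


def log2_expected_solutions(bits, kappa):
    """Equation (2): log2 of the expected number of solutions."""
    return bits * (1 - kappa)


print(round(kappa_ksat(100, 430), 2))           # 3-SAT at l/n = 4.3
print(round(kappa_csp(20, 10, 0.5, 0.32), 2))   # CSP with n=20, m=10, p1=0.5
print(round(kappa_csp(20, 10, 0.5, 0.42), 2))
```

The first value agrees with κ ≈ 0.83 at l/n ≈ 4.3, and the two CSP values agree
with the range of κ from 0.80 to 1.12 reported for p2 between 0.32 and 0.42 in
Section 3.1.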

Despite the extensive literature on phase transitions in backtracking search,
there has been little analysis of phase transitions in local search. This is perhaps
because phase transition behaviour is usually associated with the transition from
soluble to insoluble problems, and local search procedures can only solve soluble
problems. It might therefore appear that phase transitions will not be observed
with local search procedures. We can, however, conduct experiments on the
soluble phase of the ensemble. By ‘soluble phase’, we mean those problems in an
ensemble which are soluble, no matter what the generation parameters are. To
study the soluble phase, we generated problems at random as described above,
and then used a complete backtracking algorithm to eliminate insoluble problems
from the ensemble.
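
The filtering step can be illustrated for SAT with a deliberately simple
complete backtracking test. This is a sketch under stated assumptions, not the
solver used in the experiments: `is_soluble` and `soluble_phase` are
hypothetical names, clauses are tuples of signed integers, and no branching
heuristics or propagation are applied.

```python
def is_soluble(clauses):
    """Complete backtracking test: True iff the CNF formula has a model."""
    if not clauses:
        return True                      # every clause has been satisfied
    if any(len(c) == 0 for c in clauses):
        return False                     # an empty clause cannot be satisfied
    lit = clauses[0][0]                  # branch on the first literal seen
    for choice in (lit, -lit):
        # Making `choice` true satisfies the clauses containing it and
        # removes the opposite literal from the remaining clauses.
        reduced = [tuple(x for x in c if x != -choice)
                   for c in clauses if choice not in c]
        if is_soluble(reduced):
            return True
    return False


def soluble_phase(problems):
    """Keep only the soluble problems of a randomly generated ensemble."""
    return [p for p in problems if is_soluble(p)]
```

Because the test is exhaustive, it is only practical for small instances such
as those studied here.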

2.4  Local  Search  Procedures 

Local search procedures start with an initial assignment of values to the
variables. They then explore their “local neighbourhood” for “better” assignments.
The local neighbourhood usually consists of those assignments where the value of
one variable is changed. A “score” function is applied to determine which
neighbour to move towards. In SAT, we use the number of satisfied clauses. In CSP,
we use the number of satisfied constraints. Hill-climbing is used in the GSAT
and min-conflicts procedures to maximize the score. Other procedures use more
complex rules for selecting neighbours. For example, the MC-LOG procedure
chooses a neighbour probabilistically according to the relative ranking of
the scores. Local search procedures can, of course, be trapped in local maxima.
Various techniques have been developed to overcome this. For example, GSAT simply
restarts from a new point in the state space. By comparison, procedures like
MC-LOG and simulated annealing allow score-decreasing moves with a certain
probability. The experiments in this paper use three local search procedures:
GSAT, a CSP analogue of GSAT called GCSP, and a min-conflicts algorithm
for CSPs called MC-LOG.

GSAT [18] is a local search procedure for SAT which begins with a randomly
generated initial truth assignment, then hill-climbs by reversing or “flipping” the
assignment of the variable which increases the number of satisfied clauses the
most. After a fixed number, MaxFlips, of moves, search is restarted from a new
random truth assignment. Search continues until we find a model or we have
performed a fixed number, MaxTries, of restarts.
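
The description of GSAT can be turned into a short sketch. It is a simplified
illustration rather than the implementation of [18]: the function names and the
signed-integer clause encoding are assumptions, ties between equally good flips
are broken at random (an assumption, since the text does not say), and scores
are recomputed from scratch instead of incrementally.

```python
import random


def num_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def gsat(clauses, n, max_flips, max_tries, rng=None):
    """Return (model or None, total number of flips used)."""
    rng = rng or random.Random()
    total_flips = 0
    for _ in range(max_tries):
        # Start each try from a random truth assignment.
        assignment = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(max_flips):
            if num_satisfied(clauses, assignment) == len(clauses):
                return assignment, total_flips
            # Score every single-variable flip and keep the best ones.
            best_score, best_vars = -1, []
            for v in range(1, n + 1):
                assignment[v] = not assignment[v]
                score = num_satisfied(clauses, assignment)
                assignment[v] = not assignment[v]
                if score > best_score:
                    best_score, best_vars = score, [v]
                elif score == best_score:
                    best_vars.append(v)
            v = rng.choice(best_vars)  # assumption: random tie-breaking
            assignment[v] = not assignment[v]
            total_flips += 1
        if num_satisfied(clauses, assignment) == len(clauses):
            return assignment, total_flips
    return None, total_flips
```

Note that the best flip is taken even when it does not improve the score, so
the procedure can move sideways or downhill within a try.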

GCSP [20] is an analogous procedure to GSAT for CSPs. It begins with a
randomly generated assignment of values to variables, then hill-climbs by finding
a new variable-value assignment which increases the number of satisfied
constraints the most. After a fixed number, MaxChanges, of moves (exactly
analogous to MaxFlips in GSAT), search is restarted from a new random assignment.
Search continues until we find a solution or we have performed a fixed number,
MaxTries, of restarts.

MC-LOG is based on min-conflicts hill-climbing [13] but with an ability to
escape local maxima. Unlike GCSP, MC-LOG does not consider all variables, but
instead randomly selects a variable in conflict with some constraint. The local
neighbourhood for this variable consists of alternative values for it. Unlike
min-conflicts hill-climbing, MC-LOG does not select the neighbour which minimizes
the number of conflicts, but ranks all neighbours according to their min-conflicts
‘score’, and selects one probabilistically. Changing the value of this variable is
called a ‘repair’. The selection function is logarithmic, so the ‘best’ value is
chosen most often, but not exclusively as with min-conflicts. (More precisely,
we pick the ith ranked value, where i = int(log2(1/r)/w), r is a random number
in [0, 1], and w is some fixed weighting.) The number of conflicts can therefore
occasionally increase, enabling the procedure to escape from local maxima.
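
The logarithmic selection rule i = int(log2(1/r)/w) can be sketched as follows.
This is an illustration under stated assumptions: ranks are taken to be
0-based with rank 0 the value with fewest conflicts, ranks beyond the last
value are clamped to the worst value, and r is drawn from (0, 1] to avoid
division by zero. None of these details is specified in the text, and the
function names are hypothetical.

```python
import math
import random


def mc_log_rank(r, w, n_values):
    """Rank index selected for a draw r in (0, 1]; 0 is the best rank."""
    i = int(math.log2(1.0 / r) / w)
    return min(i, n_values - 1)  # assumption: clamp to the worst rank


def mc_log_choose(values, conflicts, w, rng=None):
    """Pick a value for the selected variable.

    `conflicts` is a function mapping a candidate value to its number of
    conflicts (hypothetical interface for this sketch).
    """
    rng = rng or random.Random()
    ranked = sorted(values, key=conflicts)  # fewest conflicts first
    r = 1.0 - rng.random()                  # uniform in (0, 1]
    return ranked[mc_log_rank(r, w, len(ranked))]
```

Under these assumptions the best-ranked value is chosen with probability
1 − 2^(−w), so a larger weighting w makes the procedure behave more like plain
min-conflicts hill-climbing.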

3  Phase  Transitions  and  Local  Search 

We  first  investigate  the  performance  of  local  search  as  we  vary  the  number  of 
solutions  for  a  fixed  size  of  problem.  Naively,  one  might  think  that  problems 
will  get  monotonically  harder  as  we  decrease  the  number  of  solutions  since  we 
must  search  for  an  ever  smaller  number  of  needles  in  a  haystack.  However,  even 
though  all  the  problems  tested  are  soluble,  behaviour  is  affected  by  the  solubility 
phase  transition.  Indeed,  the  hardest  problems  for  local  search  seem  to  occur  at 
the  same  point  as  the  hardest  problems  for  complete  search,  namely  at  the  phase 
transition  in  solubility.  In  this  paper,  we  take  this  to  be  the  point  in  the  phase 
space  where  50%  of  problems  are  soluble  and  50%  insoluble.  This  point  is  often 
associated  with  the  hardest  mean  search  cost  for  backtracking  algorithms  [3]. 

3.1  MC-LOG 

In Fig 1 (left), we present results for MC-LOG on 1000 soluble CSPs with n = 20,
m = 10, and p1 = 0.5, and p2 varying from 0.32 to 0.42, corresponding to
a range of κ from 0.80 to 1.12. We plot search cost (the number of repairs)
against the constrainedness, κ. The phase transition in solubility starts at
κ = 0.89, where 99.1% of problems generated are soluble (recall that we simply
discard insoluble problems); the nearest point to 50% solubility is at κ = 0.95,
where 57.0% are soluble; and our graphs extend to regions where very few
problems are soluble. At κ = 1.12, only 1.2% of problems had solutions. The peak
in median search cost is at κ = 0.99, while the peak in mean search cost is
slightly earlier at κ = 0.95. Surprisingly, at larger values of κ the search cost
decreases even though the average number of solutions is declining. As with
complete procedures, the peak average search cost is associated with the
solubility phase transition.

Similar results are obtained if we study CSPs with different constraint
densities. In Fig 1 (right), we vary p2 and plot median search cost for p1 = 0.30
to p1 = 1.00, i.e. complete constraint graphs. As p1 increases, the phase
transition occurs at smaller values of p2. In all cases the peak median search
cost is within 0.01 of the value of p2 where 50% of problems are soluble. For
comparison of different problem classes, we also plot our data against κ instead
of p2, as in Fig 1. As the constraint density p1 increases, the peak search cost
(and solubility transition) occurs nearer to the expected value of κ = 1. As p1
increases, the mean search cost at the phase transition increases and, by (2),
the expected number of solutions decreases. This suggests a correlation between
the average number of solutions and search cost at the phase transition. However,
the picture is not clear, since at a fixed κ there is a fixed expected number of
solutions, yet the search cost varies by a large factor for different p1.

We draw two conclusions from this data. First, we observe an ‘easy-hard-easy’
pattern of problem difficulty. The hard region is associated with the solubility
phase transition, despite the fact that as we make problems more constrained,
they have fewer solutions. This is consistent with the results of [14, 19] for
backtracking algorithms applied to soluble problems. Second, there is some
correlation between peak search cost and expected number of solutions at the
phase transition.


Fig. 1. (left) Search cost for MC-LOG on (20,10,0.5) problems as p2 varies, plotted
against kappa. (right) Median search cost for MC-LOG on (20,10,p1) problems.


3.2  GCSP 

To determine if the results of Section 3.1 are specific to the MC-LOG procedure,
we re-ran the experiment reported in Fig 1 (left) using the local search procedure
GCSP. To minimise variation, we tested GCSP with the identical problems used
with MC-LOG. We set MaxChanges to 500 and ran until problems were solved,
i.e. we effectively set MaxTries to infinity. (Two of the authors implemented
GCSP independently and observed very similar results.) We plot total changes
used by GCSP against κ in Fig 2 (left). Total changes is calculated as MaxChanges
times the number of failed tries, plus the number of changes on the final try,
and gives a measure of search cost for GCSP.

Behaviour is broadly similar to that seen with MC-LOG even though these
procedures have significant differences. GCSP uses restarts instead of
probabilistic acceptance to avoid local maxima. In addition, GCSP makes a global
choice of the best variable-value assignment rather than a local choice of value
for a selected variable.


GCSP on <20,10,0.5> problems    GSAT on Random 3-SAT 100 vars


Fig. 2. (left) GCSP phase transition behaviour (right) GSAT phase transition
behaviour


This  suggests  that  other  local  search  procedures  for  binary  CSPs  will  display 
an  easy-hard-easy  pattern  with  the  peak  in  search  cost  occurring  at  the  solubility 
phase  transition. 
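
The search cost measure used for GCSP in this section, total changes, is simple
arithmetic and can be written down directly. The function and argument names
below are hypothetical, and the numbers in the usage line are made up for
illustration; they are not experimental data.

```python
def total_changes(max_changes, failed_tries, changes_on_final_try):
    """MaxChanges times the number of failed tries, plus the changes made
    on the final (successful) try."""
    return max_changes * failed_tries + changes_on_final_try


# Illustrative values only: three failed tries of 500 changes each,
# followed by a successful try that needed 120 changes.
print(total_changes(500, 3, 120))  # 1620
```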


3.3  GSAT 

It is likely that these results for CSPs will apply to other NP-complete problem
classes like SAT. To test this hypothesis, we performed a similar experiment with
100-variable random 3-SAT problems. We varied the number of clauses from 20
to 420 in steps of 20, and from 420 to 460 in steps of 10. To aid comparison with
CSPs we will plot our results against κ. For the 3-SAT problems studied here,
the value of l/n can be read off from the relation l/n ≈ 5.19κ. The end point
corresponds to a value of κ ≈ 0.89. At each point we tested GSAT on 1000
soluble problems. We set MaxFlips to 342, the optimal value for the middle of
the phase transition reported by [10]. As a measure of search cost we use the
total flips used, calculated analogously to total changes in GCSP. The phase
transition in solubility starts at 380 clauses, κ ≈ 0.73, where 99.3% of problems
generated are soluble. The point nearest 50% solubility is 430 clauses, κ ≈ 0.83,
where 48.2% of problems were soluble. At the end point of our experiment, 9.9%
of problems were soluble. The fact that κ is smaller at the phase transition than
in previous graphs indicates that there are more solutions at this point than for
comparably sized CSPs.

Fig 2 (right) shows different percentiles of search cost plotted against κ. The
peak in median is at the next point beyond the solubility transition, and indeed
the peak in the 90th percentile of search cost is at 430 clauses, the 50% solubility
point. Even best case behaviour seems to get easier as we increase κ.

To conclude, for both random CSPs and SAT problems, we see an
easy-hard-easy pattern in the cost of local search procedures with the peak in
search cost occurring at the phase transition in solubility.

4  Search  Cost  and  Number  of  Solutions

In the last section, we suggested that naively one expects problems to get
monotonically harder for local search as the number of solutions decreases. To
determine how true this is, in Fig 3 we give scatter plots of search effort for
MC-LOG (log10(repairs) to find a solution) against log10(number of solutions).
Inevitably we must investigate comparatively small problem sizes such as n = 20,
as otherwise the exhaustive search to find all solutions becomes prohibitive. Each
data point reported is the mean of 10 runs of MC-LOG to solve an individual problem.

In Fig 3 (left), we give a scatter plot for all problems in the soluble phase.
At large numbers of solutions, there is a close correlation between the number of
solutions and the search cost, and the spread in search costs is relatively small.
The overall shape of Fig 3 (left) suggests a linear correlation between the log
number of solutions and log search cost. We performed linear regression on this
data, finding a best fit gradient of -0.31 with a correlation coefficient r of -0.79.
At small numbers of solutions, however, the spread is huge, up to nearly 3 orders
of magnitude. This suggests that search cost is not simply a function of the
number of solutions.
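
The kind of fit reported above, an ordinary least-squares line through
(log10(number of solutions), log10(search cost)) together with the correlation
coefficient r, can be sketched as follows. The function name `loglog_fit` is
hypothetical, and the data in the usage lines are synthetic points lying exactly
on a power law; they are not the experimental data.

```python
import math


def loglog_fit(num_solutions, costs):
    """Least-squares fit of log10(cost) against log10(number of solutions).

    Returns (gradient, intercept, r), where r is the Pearson correlation
    coefficient of the two logged series.
    """
    xs = [math.log10(s) for s in num_solutions]
    ys = [math.log10(c) for c in costs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return gradient, intercept, r


# Synthetic example: cost = 1000 * solutions ** -0.5 exactly.
gradient, intercept, r = loglog_fit([1, 100, 10000], [1000, 100, 10])
```

A gradient of -0.31 on these axes would mean that a tenfold increase in the
number of solutions is associated with roughly a halving of search cost
(10^(-0.31) ≈ 0.49).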

We  obtain  a  better  picture  of  behaviour  if  we  look  at  fixed  points  in  the  phase 
space.  In  Fig  3  (right),  we  give  a  scatter  plot  for  problems  with  a  low  constraint 
tightness  (p2  =  0.32),  in  Fig  4  (left),  problems  from  the  middle  of  the  phase 
transition  {p2  =  0.37),  and  in  Fig  4  (right),  problems  with  a  large  constraint 
tightness  {p2  =  0.42).  At  each  point,  there  is  less  overall  spread  in  the  search 
cost  for  a  given  number  of  solutions  than  seen  in  Fig  3  (left).  We  performed 
regression  analysis  at  each  point  separately,  and  the  resulting  lines  are  shown. 
At  p2  =  0.32  we  estimated  a  gradient  of  -0.36  with  r  =  -0.56,  at  p2  =  0.37,  we 
estimated  a  gradient  of  -0.61  with  r  =  -0.70,  and  at  p2  =  0.42,  we  estimated 
a gradient of -0.29 with r = -0.28. Notice that the gradient is steepest at
the  solubility  phase  transition.  Problems  with  a  low  constraint  tightness  have 
a  large  number  of  solutions  and  search  cost  is  relatively  uniform  and  small.  At 
the  phase  transition,  there  is  a  large  variation  in  the  number  of  solutions.  For 
problems  with  few  solutions  at  the  phase  transition,  search  cost  tends  to  be  large. 
Although  problems  with  a  larger  constraint  tightness  have  a  smaller  number  of 
solutions,  search  cost  for  problems  with  the  same  number  of  solutions  tends  to 
be  less  than  at  the  phase  transition.  As  seen  earlier,  the  overall  cost  is  greatest  at 
the  phase  transition.  Although  there  is  still  considerable  variability,  the  hardest 
problems  are  those  from  the  phase  transition  with  a  few  solutions. 

4.1  GCSP 

To  determine  if  these  results  are  specific  to  the  MC-LOG  procedure,  we  also  made 
scatter  plots  for  the  logarithm  of  search  cost  of  GCSP  against  the  logarithm  of 
the  number  of  solutions.  Each  data  point  is  the  cost  of  solving  an  individual 
problem  once.  Fig  5  (left)  gives  the  plot  for  the  middle  of  the  phase  transition. 
The regression line has gradient -0.56 and r = -0.40. Results are very similar to
MC-LOG. Again there is considerable variability, but problems with more solutions
are generally much easier than problems with very few solutions. The hardest
problems are again those from the phase transition with a single or very few
solutions. The greater noise in this figure than seen with MC-LOG is probably
due to only reporting a single run rather than an average of 10 runs.

Fig. 4. log10(repairs) for MC-LOG (y-axis) plotted against log10(solutions) (x-axis).
(left) p2 = 0.37, (right) p2 = 0.42

4.2  GSAT 

Once  again,  we  wished  to  see  if  our  results  extend  beyond  CSPs  to  SAT.  To  test 
this,  we  randomly  generated  1000  soluble  3-SAT  problems  with  100  variables  and 
430  clauses,  the  point  nearest  the  solubility  phase  transition  as  reported  above. 

Fig. 5. log10(cost) (y-axis) plotted against log10(solutions) (x-axis). (left) GCSP at
p2 = 0.37, (right) GSAT at n = 100, l = 430


We  ran  GSAT  on  100  tries  and  used  this  to  give  an  estimate  of  search  cost  as 
simply  100  divided  by  the  number  of  successes  for  each  problem.®  We  also  found 
the  exact  number  of  solutions  that  each  problem  has.  The  relationship  between 
these  two  measures  is  shown  in  Fig  5  (right).  Again,  there  is  a  strong  tendency 
for  problems  with  more  solutions  to  be  easier,  although  again  there  is  a  lot  of 
noise. We modeled this by a regression line with gradient -0.44 and r = -0.77.
The  granularity  in  the  search  cost  in  the  figure  is  due  to  the  measure  we  used, 
and  the  use  of  this  measure  may  make  the  regression  less  accurate.  We  intend 
further  experiments  to  investigate  this. 
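
The cost measure used for GSAT here, and the granularity it produces, can be illustrated with a short sketch. The function name `estimated_cost` is an assumption for this example; the zero-success case is handled explicitly because the text does not say how such problems were treated.

```python
def estimated_cost(successes, tries=100):
    """Estimated number of tries per solution: tries / successes."""
    if successes == 0:
        # Not covered by the text; treated here as "no estimate available".
        raise ValueError("no successful try, cost cannot be estimated")
    return tries / successes

# Every value the measure can take with 100 tries, largest first.
costs = sorted({estimated_cost(s) for s in range(1, 101)}, reverse=True)
```

The largest attainable values are 100, 50, 33.3 and 25, so the hardest problems can only fall on a few widely spaced levels of the cost axis, which is the granularity visible in the figure.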

To  summarise,  we  have  shown  a  strong  if  noisy  correlation  between  search 
cost  and  number  of  solutions.  The  correlation  seems  to  be  linear  on  a  log-log 
scale.  The  dependency  between  search  cost  and  number  of  solutions  changes  as 
we  change  the  constrainedness  of  problems,  with  the  hardest  problems  being 
those  at  the  solubility  phase  transition  with  very  few  solutions. 


5  Regression  Analysis  Through  Phase  Transitions 

We  have  shown  that  the  correlation  between  search  cost  and  the  number  of 
solutions  varies  at  different  points  in  the  solubility  phase  transition.  We  now 
look  more  closely  at  this  variation. 

Fig  6  shows  the  regression  lines  for  MC-LOG  for  all  values  of  constraint 
tightness  p2  across  the  phase  transition.  There  is  a  distinct  pattern  in  these 
regression  lines;  at  low  constraint  tightness,  the  magnitude  of  the  gradient  is 
low, but increases to reach a peak when p2 = 0.36 (κ = 0.92), which is at the
phase  transition.  The  gradient  then  decreases  as  constraint  tightness  increases. 

This is shown more clearly in Fig 7 (left), which shows the values of the gradient
against κ. This makes clear that the steepest gradient is at the solubility phase
transition. We interpret this as suggesting that the number of solutions has most
influence on search cost at the solubility phase transition.

^ Thus we are using a different cost measure to that used above for GSAT in Fig 2.

Fig 7 (right) shows how the regression fit gradient changes against κ, for
p1 = 0.30 to p1 = 1.00. In each case, the minimum gradient (maximum absolute
gradient) occurs at or very close to the 50% solubility point on the phase transition.
Once again, this suggests that the number of solutions has most influence
on  search  cost  at  the  phase  transition. 


Fig. 6. Regression fits for log search effort (x-axis) vs log solutions (y-axis), as p2
changes (20, 10, 0.50, p2)


5.1  GCSP 

We  again  continued  our  investigations  by  seeing  if  our  results  would  also  be 
seen  in  GCSP.  Accordingly,  we  plotted  the  regression  gradients  obtained  from 
analysis of experiments on problems from the classes (20, 10, 0.5) with varying p2.
The  results  are  seen  in  Fig  8  (left).  The  same  pattern  emerges  as  in  Fig  7  (left). 
The  steepest  gradient  is  at  exactly  the  same  point.  Some  noise  can  be  seen  in 
this  graph,  perhaps  due  as  we  noted  earlier  to  not  averaging  GCSP  results  over 
many  runs.  Nevertheless,  it  seems  that  for  a  range  of  CSP  local  search  methods, 
the number of solutions matters most at the phase transition.

5.2  GSAT 

To  conclude  our  investigations  of  how  regression  gradients  changed  through  the 
phase  transitions,  we  analysed  our  data  for  GSAT  with  100  variables  from  410 
to 480 clauses. Again the gradient of the regression between number of solutions
and search cost changes as we vary the constrainedness of problems. Recalling
that the solubility phase transition is at 430 clauses, in this case we do not find
the steepest gradient at this point. As yet we are unclear as to the reasons for
this change from our results on CSPs. In particular, it must for the moment
remain open whether this is due to an inherent difference between 20 variable
CSP instances and 100 variable 3-SAT instances, or to some more mundane
feature such as the differing measures of search costs used. One significant difference
between the CSP and 3-SAT classes we have studied here may be that
the solubility transition in 3-SAT occurs at lower values of κ, i.e. where more
solutions occur in relation to problem size.

Fig. 7. Regression fit gradients (y-axis) plotted against κ (x-axis) for (left) p1 = 0.50,
and (right) for p1 from 0.30 to 1.00.

6  Comparison  with  Complete  Algorithms 

Systematic backtracking procedures will also be affected by the number of solutions.
If there are many solutions, then it may not be hard to find a path
that leads to one. If there are few, it may be hard to find a path that ends in
a solution. In Fig 9, we give scatter plots for the search effort for the FC-CBJ-BZ
algorithm (measured in consistency checks) against the number of solutions.
We  give  plots  for  both  the  search  effort  to  find  the  first  solution  and  to  find 
all  solutions.  FC-CBJ-BZ  is  a  forward  checking  algorithm  with  conflict-directed 
backjumping  [15]  using  the  Brelaz  heuristic  for  dynamic  variable  ordering  [1]. 
This  is  currently  one  of  the  best  algorithms  for  CSPs. 

There  is  a  tendency  for  search  cost  to  find  the  first  solution  to  decrease  with 
the number of solutions, and this can be regressed by a line with gradient -0.18
and r = -0.21. However, the tendency is very weak and even noisier than the
plots seen earlier for local search. For finding all solutions, there seems to be no
tendency for increasing numbers of solutions to reduce cost. This is not surprising
since to find all solutions, all parts of the search space must be explored in full.
It seems that new features of the solution space will have to be investigated to
predict the search cost for complete algorithms.

Fig. 8. Regression fit gradients (y-axis) plotted against κ (x-axis). (left) GCSP on
(20, 10, 0.5) problems, (right) GSAT on n = 100 problems


Fig. 9. log10(checks) (y-axis) for FC-CBJ-BZ against log10(solutions) (x-axis). (left)
checks until first solution found, (right) checks to find all solutions

7  Related  Work 

Phase  transitions  in  backtracking  algorithms  have  been  the  subject  of  enormous 
study  in  recent  years,  but  comparatively  little  attention  has  been  paid  to  the 
behaviour of local search algorithms as the position in the phase space changes.
Experimental results for GENET and GSAT applied to random 3-colouring instances
are reported by [4]. These results are consistent with those reported in
Section  3  above,  in  that  search  cost  reduces  beyond  the  phase  transition,  and 
further  suggest  that  the  results  reported  here  are  unlikely  to  be  dependent  on 
the  particular  algorithms  or  classes  studied. 

Gent and Walsh showed that a range of variants of GSAT applied to non-random
problems gave good performance on problems with many solutions such
as  the  n-queens  problem,  and  poor  performance  on  problems  with  few  solutions, 
for  example  quasigroup  construction  problems  [10].  They  speculated  that  the 
number  of  solutions  in  relation  to  problem  size  would  be  critical  in  understanding 
local  search  cost.  In  this  paper  we  have  shown  that  this  speculation  is  confirmed 
when  random  problems  are  studied,  although  there  are  other  features  which 
must  be  taken  into  account  such  as  position  in  the  phase  space.  This  still  does 
not  yield  a  full  explanation  of  behaviour,  so  we  wish  to  research  the  topology  of 
local  search  in  more  depth,  following  studies  such  as  [9,  12]. 

In  this  paper  we  have  studied  three  local  search  algorithms  for  two  different 
problem  classes.  The  fact  that  we  see  similar  results  in  each  case  suggests  that 
our  results  may  well  apply  to  a  large  number  of  similar  algorithms.  However,  it 
remains  an  interesting  question  if  these  results  will  apply  to  algorithms  such  as 
WSAT  [17]  which  may  explore  the  search  space  in  different  ways. 

8  Conclusions 

We  have  investigated  in  depth  the  relationship  between  the  number  of  solutions 
and search cost for local search procedures. Although there is no single simple
story (such as search cost being inversely proportional to the solution density),
we have identified some important connections. In particular, the hardest problems
tend to have few solutions and usually occur (as with complete, systematic
algorithms) at the solubility phase transition. We have shown that there is a significant
correlation between the number of solutions and problem hardness for
local search. This correlation is robust across problem class and types of local
search procedure, and across the phase transition. The number of solutions is,
however, not the only factor determining problem hardness since there is significant
variation in problem hardness when the solution density is held constant.
These  results  improve  our  understanding  of  phase  transition  behaviour  and  of 
the  factors  affecting  the  performance  of  local  search  methods. 

Abstract. In this paper we investigate which constraints may be derived
from a given set of constraints. We show that, given a set of relations
$\mathcal{R}$, all of which are invariant under some set of permutations P, it
is possible to derive any other relation which is invariant under P, using
only the projection, Cartesian product, and selection operators, together
with the effective domain of $\mathcal{R}$, provided that the effective domain
contains at least three elements. Furthermore, we show that the condition
imposed on the effective domain cannot be removed. This result sharpens
an earlier result of Paredaens [13], in that the union operator turns
out to be superfluous. In the context of constraint satisfaction problems,
this result shows that a constraint may be derived from a given set of
constraints containing the binary disequality constraint if and only if it
is closed under the same permutations as the given set of constraints.

Keywords: Relational database, relational algebra, constraint derivation


1  Introduction 

In  a  constraint  satisfaction  problem  [11,  12]  some  of  the  constraints  are  explicit, 
whilst  others  are  generally  present  only  as  implicit  or  derived  constraints.  In 
this  paper  we  investigate  which  constraints  it  is  possible  to  derive  using  a  given 
collection of explicit constraint types.

The  approach  taken  is  interdisciplinary,  drawing  on  the  results  of  relational 
database theory. The close relationship between the theory of constraint satisfaction
problems and relational databases has been pointed out by several authors
[2, 4, 6, 15]. In this paper we present a result concerning the algebraic properties
of relations which sharpens an earlier result of Paredaens [13] in database
theory,  and  thereby  allows  an  application  to  constraint  satisfaction  problems. 

The  result  obtained  by  Paredaens  states  that  a  database  relation  can  be 
obtained  from  a  set  of  database  relations  using  a  relational  algebra  expression 
if  and  only  if  the  relation  is  invariant  under  all  permutations  of  the  database 
domain under which all relations of the given set are invariant^.

Paredaens actually proved more than he claimed: he showed that a relation
satisfying his invariancy condition can be derived using only the monotonic
projection, Cartesian product, equality and disequality selection, and union
operators. Thus, the non-monotonic difference operator of the relational algebra is
not  required. 

In  this  paper,  we  shall  show  that,  when  the  effective  domain  of  the  given 
set  of  relations  is  available,  we  can  also  dispense  with  the  union  operator  if 
this effective domain contains at least three elements. We shall give both a
non-constructive and a constructive proof of this result, and we shall also show that
the  union  operator  cannot  generally  be  dispensed  with  if  the  effective  domain 
contains  fewer  than  three  elements. 

In  those  cases  where  the  union  operator  has  been  shown  to  be  superfluous, 
the  result  may  be  applied  to  the  question  of  deriving  constraints  in  the  context 
of  constraint  satisfaction  problems.  It  demonstrates  that  a  constraint  (of  any 
arity)  may  be  “derived”  from  a  given  set  of  constraints  which  includes  the  binary 
disequality  constraint,  if  and  only  if  it  is  invariant  under  the  same  permutations 
as that set. In other words, if we are given a set of constraints, S, which contains
the binary disequality constraint, then we may obtain any other constraint, C,
as a projection of the set of solutions to some constraint satisfaction problem
involving constraints from S, provided that C is invariant under any relabelings
of the domain which leave all the constraints in S unchanged.

The consequences of this result are quite striking. It implies, for example,
that any finite constraint over the integers, of any arity, may be obtained as an
implicit constraint in a constraint network containing only binary constraints of
the form $x \neq y$, and $x = y + 1$.
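
One way to see why this holds is that, on a finite range of integers, no relabeling other than the identity leaves the constraint x = y + 1 unchanged, so every relation is trivially invariant under the relevant permutations. The brute-force check below is a sketch under that reading; the helper name `preserving_permutations` and the range 1..6 are assumptions chosen for illustration.

```python
from itertools import permutations

def preserving_permutations(relation, domain):
    """All permutations f of domain with f(t) in relation for every tuple t."""
    result = []
    for image in permutations(domain):
        f = dict(zip(domain, image))
        if all(tuple(f[v] for v in t) in relation for t in relation):
            result.append(f)
    return result

domain = list(range(1, 7))
successor = {(y + 1, y) for y in domain if y + 1 in domain}    # x = y + 1
disequal = {(x, y) for x in domain for y in domain if x != y}  # x != y

successor_perms = preserving_permutations(successor, domain)
disequal_perms = preserving_permutations(disequal, domain)
```

Disequality alone is preserved by every one of the 720 permutations of the six values, whereas the successor constraint is preserved by the identity only.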

Even  when  we  know  that  it  is  possible  to  derive  a  constraint  from  some  set 
of  given  constraints,  it  is  often  far  from  obvious  how  this  derivation  may  be 
carried out. For example, the reader is invited to attempt to design a constraint
network over positive integers less than 100 which contains an implicit constraint
of the form $x^2 + y^2 = z^2$, using only binary constraints of the form $x \neq y$
and $x = y + 1$. In Section 5 we describe explicitly how to construct a suitable
constraint  satisfaction  problem  to  derive  any  given  constraint  which  belongs  to 
the  set  of  possible  derived  constraints. 

2  Definitions  and  terminology 

Relations play a central role in both relational databases and constraint satisfaction
problems.

Definition 1. Let D be a set, called the domain, and let n be a number. An
n-ary relation over the domain D is a finite subset of $D^n$.

^  Independently,  Bancilhon  [1]  proved  a  similar  result  for  the  equivalent  relational 
calculus.  This  property  subsequently  became  known  as  the  BP-completeness  of  the 
relational  algebra  and  calculus. 
Following a common practice in constraint satisfaction problems, we shall refer
to the elements of the domain D as colors.

A  constraint  satisfaction  problem  [11,  12]  is  simply  a  hypergraph  in  which 
the  edges  are  ordered  sets  and  each  edge  is  labelled  with  a  relation  over  some 
fixed  domain  D.  These  relations  are  called  constraints.  A  solution  to  the  problem 
is  a  mapping  from  the  vertices  to  the  domain  such  that  the  image  of  the  vertices 
in  each  edge  of  the  hypergraph  is  an  element  of  the  corresponding  constraint.  In 
a  binary  constraint  satisfaction  problem  (often  called  a  constraint  network)  all 
of  the  constraints  are  relations  of  arity  2. 

In  the  context  of  constraint  satisfaction  problems,  the  domain  D  is  usually 
finite.  In  the  context  of  relational  databases,  the  domain  D  is  usually  infinite. 
In  order  to  overcome  this  apparent  mismatch  between  these  two  related  areas, 
we make use of the following notation, which is well-established in database
theory  [14]. 

Definition 2. Let $\mathcal{R}$ be a set of relations over a domain D. The effective domain
of $\mathcal{R}$, denoted $D_{\mathcal{R}}$, is defined to be the smallest subset of D over which (as
domain) $\mathcal{R}$ is a set of relations.

Note that, by Definition 1, $D_{\mathcal{R}}$ is finite whenever $\mathcal{R}$ is finite.

In  relational  database  theory,  a  standard  set  of  operators  on  relations  is 
defined,  which  is  called  the  relational  algebra  [14]. 

Definition 3. The relational algebra is a set of operators on relations consisting
of the binary operators “union” ($\cup$), “difference” ($-$), and “Cartesian product”
($\times$), and the unary operators “selection” ($\sigma$) and “projection” ($\pi$). The binary
operators are defined in the usual, set-theoretic way. The unary operators are
defined as follows:

- Let r be an n-ary relation over a domain D. Let $1 \le i, j \le n$. The equality
  selection $\sigma_{i=j}(r)$ is defined to be the n-ary relation

  $\sigma_{i=j}(r) = \{t \in r \mid t[i] = t[j]\}$.

  The disequality selection $\sigma_{i \neq j}(r)$ is defined to be the n-ary relation

  $\sigma_{i \neq j}(r) = \{t \in r \mid t[i] \neq t[j]\}$.

- Let r be an n-ary relation over a domain D. Let $i_1, \ldots, i_k$ be a subsequence
  of $1, \ldots, n$. The projection $\pi_{i_1, \ldots, i_k}(r)$ is defined to be the k-ary relation

  $\pi_{i_1, \ldots, i_k}(r) = \{(t[i_1], \ldots, t[i_k]) \mid t \in r\}$.

Using the selection operator, two special relations over a domain D can be defined:

1. the binary equality relation, $=_D$, is defined by $\sigma_{1=2}(D^2)$; and

2. the binary disequality relation, $\neq_D$, is defined by $\sigma_{1 \neq 2}(D^2)$.
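
The operators of Definition 3 translate directly into operations on sets of tuples. The sketch below is illustrative only: the function names are assumptions, relations are modelled as Python sets of tuples, and indices are 1-based to match the definitions.

```python
from itertools import product

def cartesian(r, s):
    """Cartesian product: concatenate every tuple of r with every tuple of s."""
    return {t + u for t in r for u in s}

def select_eq(r, i, j):
    """Equality selection: keep the tuples t with t[i] = t[j] (1-based)."""
    return {t for t in r if t[i - 1] == t[j - 1]}

def select_neq(r, i, j):
    """Disequality selection: keep the tuples t with t[i] != t[j] (1-based)."""
    return {t for t in r if t[i - 1] != t[j - 1]}

def project(r, *indices):
    """Projection onto the given 1-based positions."""
    return {tuple(t[i - 1] for i in indices) for t in r}

D = {0, 1, 2}
D2 = set(product(D, repeat=2))
equality = select_eq(D2, 1, 2)      # the binary equality relation on D
disequality = select_neq(D2, 1, 2)  # the binary disequality relation on D
```

On a three-color domain the equality relation has three tuples and the disequality relation the remaining six.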
We now define a much more restricted algebra which contains only three operators.

Definition  4.  The  SPJ-algebra^  consists  of  the  relational  algebra  operators 
Cartesian  product,  equality  selection,  and  projection. 

Note that intersection ($\cap$), defined in the usual set-theoretic way, can be expressed
in the SPJ-algebra, by using a sequence of Cartesian product, equality
selection and projection operations. Similarly, generalized projection, in which
the indices $i_1, \ldots, i_k$ are only required to be in the range 1 to n (their order being
arbitrary, and repetition being allowed) can also be expressed in the SPJ-algebra.
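
For instance, for two n-ary relations r and s, one possible way of writing the intersection with these three operators (a sketch of one such expression, not necessarily the one the authors have in mind) is:

```latex
r \cap s \;=\; \pi_{1,\ldots,n}\Bigl(\sigma_{1=n+1}\bigl(\sigma_{2=n+2}\bigl(\cdots\,\sigma_{n=2n}(r \times s)\,\cdots\bigr)\bigr)\Bigr)
```

The Cartesian product pairs every tuple of r with every tuple of s, the n equality selections keep only the pairs whose two halves coincide, and the projection discards the duplicate half.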

Most importantly of all, for our purposes, the join operator ($\bowtie$) [14] can be
expressed in the SPJ-algebra. Now, it is well-known that the set of solutions to
a constraint satisfaction problem (expressed as a relation) can be obtained by
performing a join operation on the constraints [2, 6]. The possible derived constraints
are the projections of these sets of solutions, as the following definition
indicates.

Definition 5. A constraint can be derived from a set of relations $\mathcal{R}$ if it is equal
to some projection of the set of solutions to some constraint satisfaction problem
with constraints chosen from $\mathcal{R}$.
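
Definition 5 can be made concrete with a small brute-force sketch: enumerate the solutions of a constraint satisfaction problem and project them onto a chosen scope. The function names, the variable names and the example network are assumptions for illustration; a practical solver would not enumerate all assignments.

```python
from itertools import product

def solutions(variables, domain, constraints):
    """All assignments satisfying every (scope, relation) constraint."""
    found = []
    for values in product(domain, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[v] for v in scope) in rel for scope, rel in constraints):
            found.append(a)
    return found

def derived(variables, domain, constraints, scope):
    """The projection of the solution set onto the variables in scope."""
    return {tuple(a[v] for v in scope)
            for a in solutions(variables, domain, constraints)}

domain = range(1, 6)
succ = {(y + 1, y) for y in domain if y + 1 in domain}  # x = y + 1
network = [(("x", "y"), succ), (("y", "z"), succ)]
x_z = derived(("x", "y", "z"), domain, network, ("x", "z"))
```

Here the constraint x = z + 2 is never stated explicitly, yet it is exactly the relation obtained by projecting the solutions onto (x, z).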

Lemma 6. Let $\mathcal{R}$ be a set of constraints over a domain D such that the binary
equality relation, $=_D$, may be derived from $\mathcal{R}$.
A constraint C can be derived from $\mathcal{R}$ if and only if C can be expressed in
the SPJ-algebra over $\mathcal{R}$.

In  other  words,  whenever  the  binary  equality  relation  is  available  as  a  constraint, 
the  notion  of  a  derived  constraint  corresponds  precisely  to  the  notion  of  a  relation 
which  may  be  expressed  in  the  SPJ-algebra. 

We shall establish in Section 3 that in order to determine which constraints
it is possible to derive from a given set of relations $\mathcal{R}$, it is helpful to consider
arbitrary operations on the domain D, defined as follows.

Definition 7. Let D be a domain and let m be a natural number. An m-ary
operation on D is a total mapping from $D^m$ to D.

Let $f : D^m \to D$ be an m-ary operation on D. Then f can be extended to
relations over D, as follows. Let r be an n-ary relation over D, and let $t_1, \ldots, t_m$
be tuples of r (not necessarily distinct). We define $f(t_1, \ldots, t_m)$ to be the n-ary
tuple $(f(t_1[1], \ldots, t_m[1]), \ldots, f(t_1[n], \ldots, t_m[n]))$ and define $f(r)$ to be the n-ary
relation $\{f(t_1, \ldots, t_m) \mid t_1, \ldots, t_m \in r\}$.

Definition 8. Let D be a domain, let f be an m-ary operation on D, and let r
be an n-ary relation over D. The relation r is closed under f if $f(r) \subseteq r$.
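
The extension of an operation to relations and the closure test of Definition 8 can be sketched as follows. The helper names and the three example operations are assumptions chosen for illustration.

```python
from itertools import product

def apply_to_tuples(f, tuples):
    """Apply an m-ary operation f coordinate-wise to m tuples of equal arity."""
    return tuple(f(*column) for column in zip(*tuples))

def image(f, m, r):
    """f(r): the results of applying f to every choice of m tuples of r."""
    return {apply_to_tuples(f, ts) for ts in product(r, repeat=m)}

def is_closed(f, m, r):
    """True if the relation r is closed under the m-ary operation f."""
    return image(f, m, r) <= set(r)

D = (0, 1, 2)
neq = {(a, b) for a in D for b in D if a != b}

def shift(a):
    return (a + 1) % 3        # a permutation of D

def shift_first(a, b):
    return (a + 1) % 3        # binary, but depends on its first argument only

def minimum(a, b):
    return min(a, b)          # binary, depends on both arguments
```

The disequality relation is closed under `shift` and `shift_first`, but not under `minimum`: applied to the tuples (0, 1) and (1, 0) it yields (0, 0).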

^  From  “Select-Project-Join.” 
In the sequel, we shall be concerned particularly with operations which only
depend on a single argument. We therefore define the following properties of an
operation.

Definition 9. Let $f : D^m \to D$ be an m-ary operation on D. The operation f
is called essentially unary if there exists $1 \le i \le m$, and $g : D \to D$, a unary
operation on D, such that, for all $d_1, \ldots, d_m$ in D, $f(d_1, \ldots, d_m) = g(d_i)$. If g is
a permutation, then f is said to be essentially a permutation.

Relations which are closed under the same operations have many common properties,
so we introduce the following notation.

Notation 1 Let $\mathcal{R}$ be a set of relations over a domain D.

- The set $\overline{\mathcal{R}}$ is the set of all relations over $D_{\mathcal{R}}$ which are closed under all
  permutations under which all the relations of $\mathcal{R}$ are closed.

- The set $\widehat{\mathcal{R}}$ is the set of all relations over $D_{\mathcal{R}}$ which are closed under all
  operations under which all the relations of $\mathcal{R}$ are closed.

3  Derivation  of  relations 

We  now  quote  two  results  from  the  literature  involving  derivation  of  relations, 
the  first  having  been  stated  in  the  context  of  constraint  satisfaction,  and  the 
second  in  the  context  of  relational  databases.  The  main  result  of  this  paper 
relies  on  combining  these  two  results.  In  order  to  state  the  results  concisely,  we 
introduce  the  following  notation. 

Notation 2 If $\mathcal{R}$ is a set of relations over a domain D, then

- $\mathcal{R}^*$ denotes the set of relations over $D_{\mathcal{R}}$ that can be obtained from $\mathcal{R}$ in the
  relational algebra.

- $\mathcal{R}^+$ denotes the set of relations over $D_{\mathcal{R}}$ that can be obtained from $\mathcal{R}$ in the
  SPJ-algebra.

Theorem 10 [8]. Let $\mathcal{R}$ be a set of relations over a finite domain D which
contains the binary equality relation. Then $\mathcal{R}^+ = \widehat{\mathcal{R}}$.

We note here that the inclusion $\mathcal{R}^+ \supseteq \widehat{\mathcal{R}}$ is proved in [8] in a non-constructive
way.

Theorem 11 [13]. Let $\mathcal{R}$ be a finite set of relations over a domain D. Then
$\mathcal{R}^* = \overline{\mathcal{R}}$.

The inclusion $\mathcal{R}^* \supseteq \overline{\mathcal{R}}$ is proved in [13] in a constructive way. In preparation for
the construction to be given later in this paper, we briefly sketch the construction
in [13].

Let D be the effective domain of $\mathcal{R}$, and let $r_1$ be the Cartesian product of all
non-empty relations in $\mathcal{R}$. Clearly, $r_1$ is closed under exactly those permutations
under which each relation in $\mathcal{R}$ is closed. Let $n_1$ be the arity of $r_1$, and let $s_1$
be the number of tuples in $r_1$. Consider the Cartesian product $r_1^{s_1}$, which is
an $s_1 n_1$-ary relation, and let $t_1$ be a tuple of $r_1^{s_1}$ containing all tuples of $r_1$ as a
subtuple. In particular, all values of D occur in $t_1$. Now, for all $i, j = 1, \ldots, s_1 n_1$,
perform the equality selection $\sigma_{i=j}$ if $t_1[i] = t_1[j]$ and the disequality
selection $\sigma_{i \neq j}$ if $t_1[i] \neq t_1[j]$. Then choose $1 \le i_1 < \cdots < i_{|D|} \le s_1 n_1$ such that
$D = \{t_1[i_1], \ldots, t_1[i_{|D|}]\}$, and perform the projection $\pi_{i_1, \ldots, i_{|D|}}$. Call the resulting
relation $r_2$. Let $t_2 = t_1[i_1, \ldots, i_{|D|}]$. Now it is easy to show that $r_1$ is closed under
a permutation $f : D \to D$ if and only if there exists a tuple t in $r_2$ such that, for
all $i = 1, \ldots, |D|$, $f(t_2[i]) = t[i]$. This statement remains true if $t_2$ is an arbitrary
tuple of $r_2$. Hence $\overline{\mathcal{R}} = \overline{\{r_1\}} = \overline{\{r_2\}}$.
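
The construction of r2 can be tried out on very small relations. The sketch below is a direct and deliberately naive rendering of the steps just described, for a single relation r1; the function names are assumptions, and the enumeration grows exponentially, so it is only usable on toy examples. A brute-force search over permutations is included as a cross-check.

```python
from itertools import permutations, product

def paredaens_r2(r1):
    """Return (t2, r2) for a non-empty relation r1, following the sketch above."""
    tuples = sorted(r1)
    # t1 concatenates every tuple of r1, so it is a tuple of the s1-fold product.
    t1 = tuple(v for t in tuples for v in t)
    width = len(t1)
    # The selections keep exactly the tuples of the product whose pattern of
    # equal and unequal positions matches that of t1.
    selected = []
    for blocks in product(tuples, repeat=len(tuples)):
        u = tuple(v for t in blocks for v in t)
        if all((u[i] == u[j]) == (t1[i] == t1[j])
               for i in range(width) for j in range(i + 1, width)):
            selected.append(u)
    # Project onto one position per domain value (its first occurrence in t1).
    first_position = {}
    for position, value in enumerate(t1):
        first_position.setdefault(value, position)
    indices = sorted(first_position.values())
    t2 = tuple(t1[i] for i in indices)
    r2 = {tuple(u[i] for i in indices) for u in selected}
    return t2, r2

def closed_permutation_images(r1, t2):
    """Brute force: images of t2 under every permutation r1 is closed under."""
    domain = sorted(set(t2))
    images = set()
    for values in permutations(domain):
        f = dict(zip(domain, values))
        if all(tuple(f[v] for v in t) in r1 for t in r1):
            images.add(tuple(f[v] for v in t2))
    return images

chain = {(0, 1), (1, 2)}
neq = {(a, b) for a in range(3) for b in range(3) if a != b}
t2_chain, r2_chain = paredaens_r2(chain)
t2_neq, r2_neq = paredaens_r2(neq)
```

For the relation `chain` only the identity permutation survives, so r2 contains the single tuple t2; for disequality on three colors all six permutations survive and r2 has six tuples.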

Now let r be any relation in $\overline{\mathcal{R}}$, and let n be the arity of r. Let t be an arbitrary
tuple of r. Choose $1 \le j_1, \ldots, j_n \le |D|$ such that, for $i = 1, \ldots, n$, $t_2[j_i] = t[i]$.
Consider the generalized projection $r_t = \pi_{j_1, \ldots, j_n}(r_2)$. We have $\{t\} \subseteq r_t \subseteq r$,
the latter inclusion because r is closed under all permutations under which $r_2$ is
closed. Hence $r = \bigcup_{t \in r} r_t$ and $r \in \mathcal{R}^*$.

This proof sketch clearly shows that the non-monotonic difference operator
is not required to construct any relation in $\overline{\mathcal{R}}$. This construction, however, relies
heavily on the use of the union operator.


4  Main  result 

In this section, we prove our main result: in Theorem 11, not only the difference
operator but also the union operator is shown to be superfluous, provided

1. we have the effective domain $D_{\mathcal{R}}$ at our disposal as a unary relation, and

2. $D_{\mathcal{R}}$ contains at least three elements.

The proof we shall give in this section is non-constructive and relies on the
following lemma.

Lemma 12. Let D be a finite domain containing at least three colors and let
$f : D^m \to D$ be an m-ary operation on D. Then the binary disequality relation,
$\neq_D$, is closed under f if and only if f is essentially a permutation.

Proof.  Omitted.  See  [3]  for  details. 
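
Although the proof is omitted, the smallest interesting case of the lemma can be checked exhaustively. The sketch below enumerates all binary operations (m = 2) on a three-color domain and tests the claim for each; it illustrates the statement for this one case and is not a substitute for the proof in [3]. All names are assumptions.

```python
from itertools import product

D = (0, 1, 2)
NEQ = [(a, b) for a in D for b in D if a != b]
ARGS = list(product(D, repeat=2))

def preserves_neq(f):
    """f maps argument pairs to colors; test closure of the disequality relation."""
    # Applying f coordinate-wise to tuples s and t of NEQ gives
    # (f(s[0], t[0]), f(s[1], t[1])), which must again be two distinct colors.
    return all(f[s[0], t[0]] != f[s[1], t[1]] for s in NEQ for t in NEQ)

def essentially_permutation(f):
    """True if f depends on one argument only, through a permutation of D."""
    for i in (0, 1):
        g = {}
        consistent = all(g.setdefault(args[i], value) == value
                         for args, value in f.items())
        if consistent and len(set(g.values())) == len(D):
            return True
    return False

# All 3^9 binary operations on D, filtered down to those preserving disequality.
closed = [f for f in (dict(zip(ARGS, values))
                      for values in product(D, repeat=len(ARGS)))
          if preserves_neq(f)]
```

Exactly twelve operations survive, namely the six permutations applied to the first argument and the six applied to the second, and each of them is essentially a permutation.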

We draw the reader’s attention to the fact that the condition that the domain D
contain at least three colors is used repeatedly in the proof of this result. Without
this condition, Lemma 12 does not hold. To see this, it suffices to observe that
the binary disequality relation on a bi-valued domain D is closed under the
so-called majority operation [10], which is the ternary operation on D returning,
on each triple of colors in D, the unique color in that tuple occurring more than
once. Clearly, the majority operation is not essentially unary.

We  now  prove  the  main  result. 
Theorem 13. Let D be a finite domain and let $\mathcal{R}$ be a finite set of relations
over D, containing the binary disequality relation $\neq_D$. If D contains at least
three colors, then $\mathcal{R}^+ = \mathcal{R}^*$.

Proof. Since $\mathcal{R}$ contains the relation $\neq_D$, a simple derivation shows that $\mathcal{R}^+$
contains the equality relation, $=_D$. Hence, by Theorem 10, $\mathcal{R}^+ = \widehat{\mathcal{R}}$. By Lemma 12,
all the operations under which the relation $\neq_D$ is closed are essentially
permutations, so $\widehat{\mathcal{R}} = \overline{\mathcal{R}}$. By Theorem 11, $\overline{\mathcal{R}} = \mathcal{R}^*$.

The theorem fails without the condition that the domain contain at least three
colors. To see this, let $D$ be a domain containing just two colors and let $\mathcal{R}$ be
the singleton set consisting of the binary relation $\neq_D$. When $|D| = 2$, the relation
$\neq_D$ is closed under the majority operation on $D$, as described above. Since
closure is preserved by the Cartesian product, equality selection and projection
operations, it follows that every relation in $\mathcal{R}^+$ is also closed under the
majority operation on $D$. Now consider the relation
$R_0 = \sigma_{2 \neq 3}(D^3) \cup \sigma_{1 \neq 3}(D^3) \cup \sigma_{1 \neq 2}(D^3)$,
which is the complement of the ternary equality relation. This relation clearly
belongs to $\mathcal{R}^*$, but it is not closed under the majority operation on $D$ and
hence does not belong to $\mathcal{R}^+$.
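A concrete triple of tuples makes the failure of closure visible. The sketch below is an added illustration; the names `R0`, `rows` and `image` are chosen here, and the colors are encoded as 0 and 1. It builds the complement of ternary equality over a two-valued domain and applies the majority operation componentwise to three of its tuples.

```python
from itertools import product

D = (0, 1)

def majority(a, b, c):
    """Value occurring at least twice among a, b, c (two-valued domain)."""
    return a if a in (b, c) else b

# R0: all triples over D that are not constant, i.e. the complement of
# the ternary equality relation.
R0 = {t for t in product(D, repeat=3) if len(set(t)) > 1}

rows = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]          # three tuples of R0
image = tuple(majority(*column) for column in zip(*rows))
print(image, image in R0)                          # prints (0, 0, 0) False
```

Each column contains two 0s, so the image is the constant tuple (0, 0, 0), which lies outside the relation.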

We  now  restate  Theorem  13  in  terms  of  relational  database  theory. 

Corollary 14. Let $\mathcal{R}$ be a finite set of relations over $D$. If $\mathcal{R}$ contains the
effective domain $D_{\mathcal{R}}$ as a unary relation, and $|D_{\mathcal{R}}| \geq 3$, then any relation
in $\mathcal{R}^*$ can be obtained without using the union or difference operators.

Note that the condition that $\neq_D \in \mathcal{R}$ is no longer required, since the relational
algebra includes the disequality selection operator.


5  Constructive  proof  of  the  main  result 

The proof of Theorem 13 exhibited in Section 4 is not entirely satisfactory,
because it is non-constructive. Unlike the proof of Theorem 11, it does not provide
a clue as to the size of the expression required to derive a relation, or, in terms
of constraint satisfaction problems, the size of the network required to derive a
constraint. In this section, we provide a constructive proof of Theorem 13.

What we have to show is how any relation in $\mathcal{R}^*$ can be obtained from $\mathcal{R}$
in the SPJ-algebra. The proof of Theorem 11 in [13], which was outlined briefly
above, indicates how to construct an expression which generates any given
relation in $\mathcal{R}^*$ using projection, selection, Cartesian product and union operators.
Because of the presence of the binary disequality relation $\neq_D$, disequality
selection can easily be simulated in the SPJ-algebra. Thus the only operator which
presents any difficulty is the union operator.

Since the union of two $n$-ary relations is equal to the complement (with
respect to $D^n$) of the intersection of their complements, it will be sufficient to
construct an expression which yields the complement of a given relation. This is
the purpose of the remainder of this proof.

Fig. 1. The construction of a witness $y$ for the variables $v_1$ and $v_2$ in the case that
$|D| = 4$. All edges are labeled with the binary disequality constraint $\neq_D$.


We shall proceed in steps. A construction which will re-occur in several of the
steps is the following. Given variables $v_1$ and $v_2$, and a domain $D$ with $|D| \geq 3$,
construct a constraint network as follows. First, add a complete graph on $|D| - 1$
new variables $x_1, \ldots, x_{|D|-1}$. Then, connect $v_1$ to each of $x_1, \ldots, x_{|D|-2}$
(but not to $x_{|D|-1}$) and $v_2$ to each of $x_2, \ldots, x_{|D|-1}$ (but not to $x_1$). Next,
add a variable $y$ and connect $y$ to each of $x_1, \ldots, x_{|D|-1}$. Finally, label each
edge with the binary disequality constraint $\neq_D$. (Figure 1 illustrates this
construction for $|D| = 4$.) The constraint satisfaction problem corresponding to the
constructed network has the following properties:

(i) whenever $v_1$ and $v_2$ are assigned the same color in a solution, $y$ must also be
assigned that color; and

(ii) whenever $v_1$ and $v_2$ are assigned different colors, every assignment of color
to $y$ is further extendible to a solution.

We shall call $y$ a witness of $v_1$ and $v_2$. (Observe that the construction fails for
$|D| < 3$.)
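Properties (i) and (ii) can be confirmed by exhaustive search on small domains. The sketch below is an added illustration rather than part of the original argument. It assumes that $y$ is joined to all of $x_1, \ldots, x_{|D|-1}$, encodes colors as integers, and uses the hypothetical names `witness_network` and `is_witness`. Beyond (i) and (ii), it also checks that every equal pair of colors on $v_1, v_2$ still extends to a solution.

```python
from itertools import product

def witness_network(k):
    """Edges of the two-variable witness network for a domain of k colors.

    Variable indices: 0 = v1, 1 = v2, 2 = y, 3 .. k+1 = x_1 .. x_{k-1}.
    Every edge carries the binary disequality constraint."""
    x = list(range(3, 3 + k - 1))
    edges = [(a, b) for i, a in enumerate(x) for b in x[i + 1:]]  # complete graph
    edges += [(0, xi) for xi in x[:-1]]   # v1 -- x_1 .. x_{k-2}
    edges += [(1, xi) for xi in x[1:]]    # v2 -- x_2 .. x_{k-1}
    edges += [(2, xi) for xi in x]        # y  -- x_1 .. x_{k-1}
    return 3 + len(x), edges

def is_witness(k):
    """Brute-force check of the witness properties for k colors."""
    n_vars, edges = witness_network(k)
    # All (v1, v2, y) triples that extend to a solution of the network.
    proj = {a[:3] for a in product(range(k), repeat=n_vars)
            if all(a[p] != a[q] for p, q in edges)}
    colors = range(k)
    forced = all(y == v1 for v1, v2, y in proj if v1 == v2)      # property (i)
    equal_ok = all((c, c, c) in proj for c in colors)            # extra sanity check
    free = all((a, b, y) in proj                                 # property (ii)
               for a in colors for b in colors if a != b for y in colors)
    return forced and equal_ok and free

for k in (2, 3, 4):
    print(k, is_witness(k))   # prints: 2 False, 3 True, 4 True
```

The case `k = 2` fails because $x_1$ and $x_{|D|-1}$ are then the same variable, so $v_1$ and $v_2$ are no longer tied to $y$.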

More  generally,  we  define  the  following  notion. 

Definition 15. Let $D$ be a finite domain. In a constraint network, a variable $y$
is called a witness for the variables $v_1, \ldots, v_n$ if the following properties hold:

(i) whenever all of $v_1, \ldots, v_n$ are assigned the same color in a solution, $y$ must
also be assigned that color; and

(ii) whenever not all of $v_1, \ldots, v_n$ are assigned the same color, every assignment
of color to $y$ is further extendible to a solution.

Fig. 2. The recursive step in the constructions in the proofs of both Lemma 16 and 17
for $|D| = 4$ and $n = 4$. All binary edges are labeled with the binary disequality
constraint $\neq_D$.

The first step of our construction is contained in the following lemma.

Lemma 16. Let $D$ be a finite domain containing at least three colors and let $n$
be a number, $n \geq 1$. Let $v_1, \ldots, v_n$ be distinct variables. A constraint network
containing $v_1, \ldots, v_n$ in which some variable $y$ is a witness for $v_1, \ldots, v_n$ can be
effectively constructed.

Proof. If $n = 1$, the isolated node $v_1$ is the desired network, since an individual
variable is always its own witness.

For larger values of $n$, we proceed recursively. First, construct a constraint
network containing, besides the variables $v_1, \ldots, v_n$, variables $y_1, \ldots, y_{n-1}$ such
that, for $i = 1, \ldots, n-1$, $y_i$ is a witness of $v_i$ and $v_{i+1}$. (Figure 2 illustrates
this construction for $|D| = 4$ and $n = 4$.) The constraint satisfaction problem
corresponding to the constructed network has the following properties, which can
readily be deduced from the analogous properties of the witness construction for
$n = 2$:

(i) whenever all of $v_1, \ldots, v_n$ are assigned the same color in a solution, all of
$y_1, \ldots, y_{n-1}$ must also be assigned that color; and
(ii) whenever not all of $v_1, \ldots, v_n$ are assigned the same color, there exists $i$,
$1 \leq i \leq n-1$, such that every assignment of color to $y_i$ is further extendible
to a solution.

The above properties guarantee that, if we add a constraint network in which $y$
is a witness for $y_1, \ldots, y_{n-1}$ to the one constructed above, $y$ will be a witness
for $v_1, \ldots, v_n$.

The next step of the construction is contained in Lemma 17, which has a very
similar proof to Lemma 16.

Lemma 17. Let $D$ be a finite domain containing at least three colors and let $n$
be a number, $n \geq 2$. Let $v_1, \ldots, v_n$ be distinct variables. A constraint network
containing $v_1, \ldots, v_n$, in which the constraint induced on $V = \{v_1, \ldots, v_n\}$ is the
complement of the $n$-ary equality relation over $D$, can be effectively constructed.

Proof. Of course, the complement of the binary equality relation is the binary
disequality relation. Thus, if $n = 2$, the network consisting of the nodes $v_1$ and $v_2$
linked by an edge labeled with the binary disequality relation is the desired
network.

For larger values of $n$, we proceed recursively. First, as in the proof of
Lemma 16, construct a constraint network containing, besides the variables
$v_1, \ldots, v_n$, variables $y_1, \ldots, y_{n-1}$ such that, for $i = 1, \ldots, n-1$, $y_i$ is a witness
of $v_i$ and $v_{i+1}$. (Figure 2 illustrates this construction for $|D| = 4$ and $n = 4$.)
Then, add a constraint network which includes the variables in the set
$Y = \{y_1, \ldots, y_{n-1}\}$ and induces the constraint on these variables which is the
complement of the $(n-1)$-ary equality relation. By properties (i) and (ii) of the
previous construction, described in the proof of Lemma 16, the constraint induced
on $V = \{v_1, \ldots, v_n\}$ is the complement of the $n$-ary equality relation.
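For $n = 3$ and $|D| = 3$ the recursion can be unrolled and checked exhaustively. The sketch below is an added illustration under the same assumptions as the witness construction above (in particular, the witness is joined to every auxiliary variable). The helper `add_witness` and the integer encoding of variables are hypothetical.

```python
from itertools import product

K = 3  # number of colors; the construction assumes at least three

def add_witness(edges, next_var, a, b):
    """Append the witness gadget for variables a and b to `edges`.

    Returns the witness variable and the next unused variable index."""
    x = list(range(next_var, next_var + K - 1))
    y = next_var + K - 1
    edges += [(p, q) for i, p in enumerate(x) for q in x[i + 1:]]
    edges += [(a, xi) for xi in x[:-1]]
    edges += [(b, xi) for xi in x[1:]]
    edges += [(y, xi) for xi in x]
    return y, y + 1

edges = []                                   # v1, v2, v3 are variables 0, 1, 2
y1, free = add_witness(edges, 3, 0, 1)       # y1 witnesses v1 and v2
y2, free = add_witness(edges, free, 1, 2)    # y2 witnesses v2 and v3
edges.append((y1, y2))                       # base case n = 2 on {y1, y2}

# Constraint induced on (v1, v2, v3): project all solutions of the network.
induced = {a[:3] for a in product(range(K), repeat=free)
           if all(a[p] != a[q] for p, q in edges)}
not_all_equal = {t for t in product(range(K), repeat=3) if len(set(t)) > 1}
print(induced == not_all_equal)              # prints True
```

The network uses nine variables in total, so the search space of $3^9$ assignments is small enough to enumerate directly.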

The $n$-ary equality relation imposed on the variables $v_1, \ldots, v_n$ is a special case of
a constraint of the form $\bigwedge_{1 \leq i < j \leq n} v_i \, A_{ij} \, v_j$, with “$A_{ij}$” either “$=_D$” or
“$\neq_D$”. We next show that the complements of all constraints of this form can be
computed in the SPJ-algebra from the binary equality and disequality constraints.

Lemma 18. Let $D$ be a finite domain containing at least three colors and let $n$ be
a number, $n > 1$. Let $v_1, \ldots, v_n$ be distinct variables. Let $r$ be any $n$-ary relation
over $D$ on $v_1, \ldots, v_n$ of the form $\bigwedge_{1 \leq i < j \leq n} v_i \, A_{ij} \, v_j$, with “$A_{ij}$” either “$=_D$”
or “$\neq_D$”. A constraint network containing $v_1, \ldots, v_n$, in which the constraint
induced on $V = \{v_1, \ldots, v_n\}$ is the complement of $r$, can be effectively constructed.

Proof. First, verify whether, for all $i, j, k$, $1 \leq i < j < k \leq n$, “$A_{ik}$” equals
“$=_D$” each time “$A_{ij}$” and “$A_{jk}$” both equal “$=_D$”. If this is not the case, $r$ is
the empty relation, whence its complement is $D^n$, for which, of course, a
constraint network can be constructed. Otherwise, let $\{V_1, \ldots, V_m\}$ be the partition
induced on $V$ by the equivalence relation $\equiv$ defined by $v_i \equiv v_j$ if and only if
“$A_{ij}$” equals “$=_D$”. If $m = 1$, then $r$ is the $n$-ary equality relation, whence the
lemma holds by Lemma 17. Thus suppose $m > 1$. Construct a constraint network
containing, besides the variables $v_1, \ldots, v_n$, variables $y_1, \ldots, y_m$ such that, for
$l = 1, \ldots, m$, $y_l$ is a witness of $V_l$. (Such a network can be effectively constructed
by Lemma 16.) Add to the network a complete graph with $|D| - m + 1$ variables,
$x_1, \ldots, x_{|D|-m+1}$, and label each of its edges with the binary disequality
constraint $\neq_D$. Finally, connect each of $y_1, \ldots, y_m$ to each of $x_1, \ldots, x_{|D|-m+1}$,
also by an edge labeled with the binary disequality constraint $\neq_D$. (Figure 3
illustrates this construction for $|D| = 5$ and $m = 3$.)

Fig. 3. Construction of the constraint network in the proof of Lemma 18 for $|D| = 5$
and $m = 3$. The heavily shaded edge is $V$; the lightly shaded areas represent the
sub-networks required to construct witnesses for the classes into which $V$ is partitioned.

We now prove that the constraint induced on $V$ by the constructed network
is indeed the complement of $r$. First, suppose that a certain assignment of colors
to $v_1, \ldots, v_n$ satisfies the constraint imposed by the complement of $r$ on $V$.


Hence there exist $i, j$, $1 \leq i < j \leq n$, such that $v_i \, A_{ij} \, v_j$ is not satisfied.
We distinguish two cases. If “$A_{ij}$” equals “$=_D$”, then let $V_l$, $1 \leq l \leq m$, be the
class of the partition of $V$ to which $v_i$ and $v_j$ both belong. Since $v_i$ and $v_j$ are
assigned different colors, the witness point $y_l$ can be assigned any color. Now,
extend the color assignment to $v_1, \ldots, v_n$ validly to $y_1, \ldots, y_{l-1}, y_{l+1}, \ldots, y_m$.
(If choices are possible, make them arbitrarily.) Then, assign to $y_l$ any color
used to color the other witness points. If, on the other hand, “$A_{ij}$” equals “$\neq_D$”,
then let $V_{l_1}$ and $V_{l_2}$, $1 \leq l_1, l_2 \leq m$, be the distinct classes of the partition of
$V$ to which $v_i$ respectively $v_j$ belong. Since $v_i$ and $v_j$ are assigned the same
color, that color can also be used to color both $y_{l_1}$ and $y_{l_2}$. Now, extend the
color assignment validly to the remaining witness points. (Again, if choices are
possible, make them arbitrarily.) In both cases, at most $m - 1$ distinct colors
have been used to color the witness points. Hence, at least $|D| - m + 1$ distinct
colors remain to color $x_1, \ldots, x_{|D|-m+1}$, and the original assignment of colors to
$v_1, \ldots, v_n$ has been extended to a solution of the constraint satisfaction problem
on the network. Conversely, assume the constraint satisfaction problem on the
constructed network has a solution. Since $|D| - m + 1$ distinct colors are required
to color $x_1, \ldots, x_{|D|-m+1}$, the points $y_1, \ldots, y_m$ are colored with at most $m - 1$
distinct colors. Thus at least two of these variables, say $y_{l_1}$ and $y_{l_2}$,
$1 \leq l_1 < l_2 \leq m$, are assigned the same color. We distinguish two cases. If there
exists $l$, $1 \leq l \leq m$, and $v_i$ and $v_j$ in $V_l$, $1 \leq i < j \leq n$, such that $v_i$ and $v_j$ are
assigned different colors, then this color assignment violates the conjunct
$v_i =_D v_j$ in $r$, whence the assignment to $v_1, \ldots, v_n$ satisfies the constraint imposed
by the complement of $r$ on $V$. If, on the other hand, for all $l$, $1 \leq l \leq m$, all
variables in $V_l$ are assigned the same color, then they must have been assigned the
same color as $y_l$. Hence, for all $v_i$ in $V_{l_1}$ and $v_j$ in $V_{l_2}$, $1 \leq i, j \leq n$, the color
assignment to $v_i$ and $v_j$ violates the conjunct $v_i \neq_D v_j$ or $v_j \neq_D v_i$ in $r$ (whichever
of the two occurs), whence also in this case the assignment to $v_1, \ldots, v_n$ satisfies
the constraint imposed by the complement of $r$ on $V$.

Observe  that  the  construction  for  the  general  case  in  the  proof  of  Lemma  18 
does  not  work  for  the  case  m  =  1,  which  is  why  this  case  was  treated  separately. 
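The smallest instance of the general case ($m > 1$) is $r = (v_1 \neq_D v_2)$, whose complement is binary equality. Both classes are singletons, so each variable is its own witness and the network reduces to a complete graph on $|D| - 1$ auxiliary variables joined to $v_1$ and $v_2$. The sketch below is an added illustration with hypothetical names; it verifies by enumeration that this network induces equality.

```python
from itertools import product

def equality_network(k):
    """Network from the construction for r = (v1 != v2), so m = 2 classes.

    Indices: 0 = v1, 1 = v2 (each its own witness), 2 .. k = x_1 .. x_{k-1}.
    Every edge carries the binary disequality constraint."""
    x = list(range(2, 2 + k - 1))        # |D| - m + 1 = k - 1 variables
    edges = [(a, b) for i, a in enumerate(x) for b in x[i + 1:]]
    edges += [(v, xi) for v in (0, 1) for xi in x]
    return 2 + len(x), edges

def induced_on_v(k):
    """Pairs (v1, v2) that extend to a solution of the network."""
    n_vars, edges = equality_network(k)
    return {a[:2] for a in product(range(k), repeat=n_vars)
            if all(a[p] != a[q] for p, q in edges)}

for k in (3, 4):
    print(induced_on_v(k) == {(c, c) for c in range(k)})   # prints True twice
```

The auxiliary variables consume $|D| - 1$ colors, which leaves a single color for both $v_1$ and $v_2$.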

For  the  last-but-one  step  in  the  construction,  we  introduce  the  following 
notation. 

Notation 3. Let $D$ be a finite domain and let $r$ be an $n$-ary relation over $D$.
Then $\hat{r}$ denotes the $2n$-ary relation over $D$ consisting of all $2n$-ary tuples $t$ over
$D$ for which $(t[1], \ldots, t[n])$ is in $r$ and for which there exists $i$, $1 \leq i \leq n$, such
that $t[i] \neq t[n+i]$.
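The relation defined in this notation is easy to compute explicitly. The sketch below is an added illustration; the function name `hat` is hypothetical, and tuples are represented as Python tuples over an explicit domain.

```python
from itertools import product

def hat(r, domain):
    """2n-ary relation built from the n-ary relation r as in Notation 3:
    the first half is a tuple of r, and the second half differs from it in
    at least one position."""
    if not r:
        return set()
    n = len(next(iter(r)))
    return {t + u for t in r for u in product(domain, repeat=n) if t != u}

D = (0, 1, 2)
r = {(0, 1)}
r_hat = hat(r, D)
print(len(r_hat))  # 8: the 9 pairs over D except (0, 1) itself
```

The condition "there exists $i$ with $t[i] \neq t[n+i]$" is the same as saying that the two halves of the tuple are not identical, which is what the comparison `t != u` tests.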

Lemma 19. Let $D$ be a finite domain containing at least three colors and let $n$
be a number, $n > 1$. Let $v_1, \ldots, v_n, w_1, \ldots, w_n$ be distinct variables. Let $r$ be an
$n$-ary relation over $D$ on $v_1, \ldots, v_n$ of the form $\bigwedge_{1 \leq i < j \leq n} v_i \, A_{ij} \, v_j$, with “$A_{ij}$”
either “$=_D$” or “$\neq_D$”. A constraint network containing $v_1, \ldots, v_n, w_1, \ldots, w_n$, in
which the constraint induced on $W = \{v_1, \ldots, v_n, w_1, \ldots, w_n\}$ is $\hat{r}$, can be
effectively constructed.

Proof. First, construct a constraint network containing, besides the variables
$v_1, \ldots, v_n, w_1, \ldots, w_n$, variables $y_1, \ldots, y_n$ such that, for $i = 1, \ldots, n$, $y_i$ is
a witness for $v_i$ and $w_i$. (Such a network can be effectively constructed by
Lemma 16.) Finally, augment the constraint network by adding an $n$-ary edge
$V = \{v_1, \ldots, v_n\}$ labeled with the relation $r$ (which is induced by a complete
graph on $v_1, \ldots, v_n$ the edges of which are appropriately labeled with
the binary equality and disequality relations $=_D$ and $\neq_D$) and an $n$-ary edge
$Y = \{y_1, \ldots, y_n\}$ labeled with the complement of $r$ (which is induced by some
constraint network by Lemma 18). (Figure 4 illustrates this construction for
$n = 4$.)

Fig. 4. Construction of the constraint network in the proof of Lemma 19 for $n = 4$.
The lightly shaded areas represent the sub-networks required to construct witnesses
for $v_1$ and $w_1$, $v_2$ and $w_2$, $v_3$ and $w_3$, and $v_4$ and $w_4$.

We now prove that the constraint induced on $W$ by the constructed network
is indeed $\hat{r}$. First, suppose that a certain assignment of colors to $v_1, \ldots, v_n$,
$w_1, \ldots, w_n$ satisfies the constraint imposed by the relation $\hat{r}$ on $W$. Obviously,
the assignment of colors to $v_1, \ldots, v_n$ satisfies the constraint imposed by the
relation $r$ on $V$. By assumption, there exists $i$, $1 \leq i \leq n$, such that $v_i$ and
$w_i$ are assigned different colors. Hence, the witness point $y_i$ can be assigned
any color. Now, extend the color assignment to $v_1, \ldots, v_n, w_1, \ldots, w_n$ validly to
$y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n$. (If choices are possible, make them arbitrarily.)
Finally, assign to $y_i$ any color such that the color assignment to $y_1, \ldots, y_n$ satisfies
the constraint imposed by the complement of $r$ on $Y$. (For instance, if $r$ satisfies
the constraint $v_i =_D v_j$, then assign to $y_i$ another color than to $y_j$; if, on the
other hand, $r$ satisfies the constraint $v_i \neq_D v_j$, then assign to $y_i$ the same color
as to $y_j$.) Then the original assignment of colors to $v_1, \ldots, v_n, w_1, \ldots, w_n$ has
been extended to a solution of the constraint satisfaction problem on the whole
network. Conversely, consider a solution of the constraint satisfaction problem
on the constructed network. By definition, the assignment of colors to $v_1, \ldots, v_n$

satisfies the constraint imposed by the relation $r$ on $V$. Similarly, the assignment
of colors to $y_1, \ldots, y_n$ satisfies the constraint imposed by the complement of $r$
on $Y$. Hence there exist $i, j$, $1 \leq i \neq j \leq n$, such that $y_i$ and $y_j$ are assigned
different colors whereas $v_i$ and $v_j$ are assigned the same color, or vice-versa. In
both cases, the assumption that $w_i$ is assigned the same color as $v_i$ (and hence
the same color as $y_i$) and $w_j$ is assigned the same color as $v_j$ (and hence the same
color as $y_j$) readily leads to a contradiction. So, either $v_i$ and $w_i$ or $v_j$ and $w_j$
have different colors, whence the assignment of colors to $v_1, \ldots, v_n, w_1, \ldots, w_n$
satisfies the constraint imposed by $\hat{r}$ on $W$.

We  are  now  ready  to  make  the  last  step  in  the  construction. 

Theorem 20. Let $D$ be a finite domain containing at least three colors and let
$n$ be a number, $n > 1$. Let $v_1, \ldots, v_n$ be distinct variables. Let $r$ be any $n$-ary
relation on $v_1, \ldots, v_n$. A constraint network containing $v_1, \ldots, v_n$ in which the
constraint induced on $V = \{v_1, \ldots, v_n\}$ is the complement of $r$ can be effectively
constructed.

Proof. We re-use a technique applied in the proof of Theorem 11. Let $s$ be the
number of tuples in $r$. Consider the Cartesian product $r^s$, which is an $sn$-ary
relation, and let $t$ be a tuple of $r^s$ containing all tuples of $r$ as a subtuple. Now,
for all $i, j = 1, \ldots, sn$, $i \neq j$, perform the equality selection $\sigma_{i=j}$ if $t[i] = t[j]$
and the disequality selection $\sigma_{i \neq j}$ if $t[i] \neq t[j]$, and let $\tilde{r}$ be the final result. Each
tuple in the $sn$-ary relation $\tilde{r}$ over $D$ is a concatenation of the tuples of $r$ into a
single $sn$-ary tuple. Not all $s!$ possible concatenations need occur, however.
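The selection step can be illustrated on small relations. The sketch below is an added example with hypothetical helper names `pattern` and `tilde`. It fixes $t$ as the concatenation of the tuples of $r$ in sorted order, and keeps the tuples of $r^s$ that agree with $t$ on every equality and disequality between positions.

```python
from itertools import product

def pattern(t):
    """Equality pattern of a tuple: positions holding equal values share a label."""
    first_seen = {}
    return tuple(first_seen.setdefault(value, len(first_seen)) for value in t)

def tilde(r):
    """Select from the s-fold Cartesian product of r the tuples whose
    equality pattern matches t, the concatenation of all tuples of r."""
    rows = sorted(r)
    t = tuple(v for row in rows for v in row)
    target = pattern(t)
    result = set()
    for choice in product(rows, repeat=len(rows)):
        u = tuple(v for row in choice for v in row)
        if pattern(u) == target:
            result.add(u)
    return result

print(sorted(tilde({(0, 1), (1, 2)})))   # [(0, 1, 1, 2)]
print(sorted(tilde({(0, 1), (1, 0)})))   # [(0, 1, 1, 0), (1, 0, 0, 1)]
```

In the first relation only one of the $s! = 2$ concatenations survives, since (1, 2, 0, 1) has a different equality pattern from (0, 1, 1, 2). In the second relation both concatenations occur.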

We start the actual construction by a constraint network containing, besides
the variables $v_1, \ldots, v_n$, variables $w_1, \ldots, w_{sn}$, and a single edge
$W = \{w_1, \ldots, w_{sn}\}$ which is labeled with $\tilde{r}$. Next, for $k = 1, \ldots, s$, let $r_k$ be the
$n$-ary relation over $D$ defined on $w_{(k-1)n+1}, \ldots, w_{kn}$ by
$\bigwedge_{1 \leq i < j \leq n} w_{(k-1)n+i} \, A_{ij} \, w_{(k-1)n+j}$, with “$A_{ij}$” being “$=_D$” if
$t[(k-1)n+i] = t[(k-1)n+j]$ and “$\neq_D$” if $t[(k-1)n+i] \neq t[(k-1)n+j]$. By
construction, $t$ above may be replaced by any tuple in $\tilde{r}$ without altering the
definitions of $r_1, \ldots, r_s$. We complete our construction by adding
$\hat{r}_1, \ldots, \hat{r}_s$ to the network, as follows: for $k = 1, \ldots, s$, we add the $2n$-ary edge
$W_k = \{w_{(k-1)n+1}, \ldots, w_{kn}, v_1, \ldots, v_n\}$ labeled with $\hat{r}_k$ (which is induced by
some constraint network, by Lemma 19). (Figure 5 illustrates this construction
for $n = 3$ and $s = 3$.)

Fig. 5. Construction of the constraint network in the proof of Theorem 20 for $n = 3$
and $s = 3$.

We now prove that the constraint induced on $V$ by the constructed network is
indeed the complement of $r$. First, suppose that a certain assignment of colors to
$v_1, \ldots, v_n$ satisfies the complement of $r$ on $V$. Assign to $w_1, \ldots, w_{sn}$ any tuple of
$\tilde{r}$. It is readily verified that the color assignment thus obtained is a solution of the
constraint satisfaction problem on the constructed network. Conversely, consider
a solution of the constraint satisfaction problem on the constructed network. In
this solution, the assignment of colors to $w_1, \ldots, w_{sn}$ is a concatenation of the
tuples of $r$, since the edge $W$ is constrained by $\tilde{r}$. Let $t_1, \ldots, t_s$ be the tuples
of $r$ in the order they occur in the present coloring. In particular, $t_1, \ldots, t_s$
satisfy $r_1, \ldots, r_s$, respectively. For each $k = 1, \ldots, s$, the assignment of colors to
$v_1, \ldots, v_n$ yields an $n$-ary tuple different from $t_k$. Hence the assignment of colors
to $v_1, \ldots, v_n$ satisfies the constraint imposed by the complement of $r$.


6  Conclusion 


The results in this paper highlight once again how fruitful the interaction
between constraint satisfaction problems and database theory can be. In previous
papers [6, 9] we applied results from database theory to obtain new results about
constraint satisfaction problems. In contrast, the results in this paper were
obtained by linking recent results on constraint satisfaction to relational database
theory.

These results also show that the expressive power of a set of constraints is very
closely related to the closure properties of the constraints as described in
Definition 8. The authors are now working on generalizing the construction exhibited
in this paper to arbitrary sets of constraints (with or without disequality).

Abstract. Recent improvements in constraint programming have made
it possible to tackle hard problems in a practical way. Before this, these
problems were solved only by specialized programs that were often complex
to implement. Scheduling problems, and more especially the job-shop problem,
belong to this class. In this paper we explain a relatively simple constraint
system, which enables us to solve 10 × 10 problems efficiently.
The method described here, based on evaluations which come as close
as possible to release and due dates of jobs to be scheduled, requires
no prior knowledge of the problem being processed, in particular, no
bounds on the optimum value (consequently no specific algorithm to find
approximate solutions). We also comment on the results of experiments
on known problems. As far as we know, the system outlined here is the
only one that, using just constraint solving and an exhaustive enumeration
strategy, can completely solve orb3 [AC91] in less than half an hour of
computational time.

Keywords:  Job-Shop  Scheduling,  Constraint  Programming,  Efficiency. 


Introduction 

The disjunctive scheduling problem, or job-shop problem, is an NP-hard problem;
one of its instances, MT10 [MT63], was solved for the first time 20
years after its presentation [CP89]. Although this type of problem was at first
considered to be an operational research (OR) speciality, improvements in constraint
programming have lent impetus to other developments around languages of
artificial intelligence [AB92, CL94, BLPN95]. The latter, even if they do not always
match the performance of OR products [CP94], have on the other
hand the advantage of being easier to implement and to adapt because of the
expressive power of constraint programming. In order to reach the performance of
specific developments, some languages possess specialized procedures and/or use
approximate techniques (simulated annealing, stochastic algorithms...). A
drawback is that the former generality of these languages is restricted when using
these extensions.

In this paper we describe several mechanisms that provide a standard
constraint solver with the means to tackle these scheduling problems. The method
suggested rests on bounding the values of variables as exactly as possible and
on the use of immediate selections [CP89]. Moreover, the enumeration strategy
which we have used allows us to completely solve 10 × 10 problems. We then
use some well known examples that highlight the efficiency of our approach.
Although relatively simple, our algorithm offers performance which is comparable,
and indeed even superior, to other achievements in this field.

The  paper  is  organized  as  follows:  the  first  part  briefly  presents  principles 
of  constraint  programming  and  describes  in  detail  different  components  of  a 
scheduling problem of the job-shop type. The second part explains several
properties of the problem intended to complete the initial constraint system. The
third  part  presents  the  constraint  solver  used,  its  employment  and  the  selected 
enumeration  strategy.  Finally,  experimental  results  are  discussed  in  the  last  part. 

1  Job-Shop  and  Constraint  System 

1.1  Constraint  Solving 

Constraint  programming  consists  essentially  in  describing  problems  by  means 
of  relations  between  variables[VH89].  The  constraint  solver  has  to  make  the 
constraint  system  consistent,  in  order  to  ensure  that  a  solution  is  possible  for  a 
given  set  of  constraints  (arc  consistency  [Mac77]).  Thus,  it  can  give  a  solution  only 
if  the  constraint  system  is  strong  enough.  Consequently,  the  difficulty  implied  by 
such  a  technique  is  in  defining  a  constraint  system  capable  of  leading  the  solver 
to  a  solution.  This  description  may  be  divided  into  three  main  parts: 

1.  an  initial  system  of  elementary  constraints  describing  problem  data,  which 
is  directly  built  from  the  problem  definition; 

2. a global constraint system not directly implied by the initial system,
involving exploitation of the properties of the problem;

3.  an  enumeration  algorithm  and  its  heuristics  to  isolate  effective  solutions. 

The last part is needed because, whatever the quality of the constraint system
may be, the values of some variables cannot be determined by the solver alone.
Indeed, in most cases, several equivalent solutions can be exhibited.

Therefore, the efficiency of a solving algorithm in this particular case essentially
depends both on the “power” of the global system and on the quality of the
enumeration strategy. Obviously, we want such a strategy to construct the smallest
possible search tree (in terms of the total number of nodes).

1.2  The  Job-Shop  Scheduling  Problem 

A job-shop scheduling problem is defined by a set of n jobs which have to be
executed on m machines. Each job consists of a sequence of m operations assigned
to the m machines. The objective is to find the shortest possible schedule (i.e.
a processing order for all operations) considering that each machine can handle,
at most, one job at a time and preemption is not allowed (each operation which
is begun must be ended without interruption).

An n × m problem will designate a job-shop of n jobs for m machines, which
represents n × m operations to schedule.

Each of the n jobs $T_i$ has a specified processing order $\sigma_i = (\sigma_i^1, \ldots, \sigma_i^m)$ on the
m machines. A job $T_i$ is represented for each machine $k$ by the operation $o_i^k$.
Each operation is defined by its duration $p_i^k$, the date from which it is ready
to be executed $r_i^k$ (release date) and the date by which it must be finished $d_i^k$
(due date). All the $p_i^k$, $r_i^k$ and $d_i^k$ are integers. The $p_i^k$ are given whereas the
$r_i^k$ and $d_i^k$ are unknown.


Fig. 1. An operation (release date r, duration p, due date d)


In the following, $o_i^k \prec o_j^k$ will mean that operation $o_i^k$ precedes operation $o_j^k$ on a
given machine $k$. Moreover, to simplify notation, references to machines will be
omitted where there is no ambiguity, so $r_i + p_i \leq r_j$ should be understood as
$r_i^k + p_i^k \leq r_j^k$ for any $k$.

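The job-shop problem is easy to express with relations between the various initial
data:

- relation between release date and due date of the same operation;

$\forall k \in \{1..m\}, \forall i \in \{1..n\}, \quad r_i^k + p_i^k \leq d_i^k$ (1)

- order between operations belonging to a given job;

$\forall i \in \{1..n\}, \forall k \in \{1..m-1\}, \quad d_i^{\sigma_i^k} \leq r_i^{\sigma_i^{k+1}}$ (2)

- mutual exclusion between operations belonging to a given machine.

$\forall k \in \{1..m\}, \forall i \in \{1..n\}, \forall j \neq i \in \{1..n\}, \quad [d_i^k \leq r_j^k] \lor [d_j^k \leq r_i^k]$ (3)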
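Relations (1) to (3) can be read directly as a checker for a proposed schedule. The sketch below is an added illustration, not the authors' solver. The function `violations`, the dictionary layout and the 2 × 2 instance are all hypothetical, and every job is assumed to have exactly one operation on every machine.

```python
def violations(sigma, p, r, d):
    """List the constraints (1)-(3) violated by the proposed dates.

    sigma[i] is the machine order of job i; p, r and d map (job, machine)
    to the duration, release date and due date of that operation.
    Assumes every job has exactly one operation on every machine."""
    jobs = range(len(sigma))
    machines = sorted(set(sigma[0]))
    found = []
    for i in jobs:
        for k in sigma[i]:
            if r[i, k] + p[i, k] > d[i, k]:                 # constraint (1)
                found.append(("(1)", i, k))
        for k, k_next in zip(sigma[i], sigma[i][1:]):
            if d[i, k] > r[i, k_next]:                      # constraint (2)
                found.append(("(2)", i, k))
    for k in machines:
        for i in jobs:
            for j in jobs:
                disjoint = d[i, k] <= r[j, k] or d[j, k] <= r[i, k]
                if i < j and not disjoint:                  # constraint (3)
                    found.append(("(3)", i, j, k))
    return found

# Hypothetical 2 x 2 instance: job 0 visits machines 0 then 1, job 1 the reverse.
sigma = [[0, 1], [1, 0]]
p = {(0, 0): 3, (0, 1): 2, (1, 0): 4, (1, 1): 1}
r = {(0, 0): 0, (0, 1): 3, (1, 0): 3, (1, 1): 0}
d = {key: r[key] + p[key] for key in r}
print(violations(sigma, p, r, d))            # []

r_bad = dict(r)
r_bad[1, 0] = 2                               # overlaps job 0 on machine 0
d_bad = {key: r_bad[key] + p[key] for key in r_bad}
print(violations(sigma, p, r_bad, d_bad))    # [('(3)', 0, 1, 0)]
```

Here each operation is given fixed dates with $d = r + p$, so the checker only validates a complete proposal; it performs no propagation.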

Although it expresses all problem data, this constraint set is not sufficient for the
constraint solver to find a solution: at most it makes it possible to validate a
proposal. This occurs for two main reasons. On the one hand, the local nature of
solving mechanisms (arc consistency) makes it possible to propagate information
only between variables directly linked through relations. The following example
illustrates what this particularity involves: if two operations o1 and o2 not yet
ordered precede a third o3, the release date lower bound of the latter will be
raised to $\max(r_{o1} + p_{o1}, r_{o2} + p_{o2})$. As in every case these two operations will
be executed before o3, it is obvious that $r_{o3} \geq \min(r_{o1}, r_{o2}) + p_{o1} + p_{o2}$. As long
as the disjunction o1/o2 is not decided, the proposed constraint system cannot
deduce this relation, so the release date lower bound may be easily improved.
On the other hand, for a given cost (i.e. scheduling duration) several solutions
can be exhibited. Indeed, shifting or swapping some operations can produce
different solutions with the same cost. In this case, the work of the solver has to be
completed by an enumeration step because it is unable to make choices (cf. 3.4).
However, solver precision can be improved using constraints over groups of
variables. In the following section, we shall study some properties of this particular
problem in order to define these global constraints. These constraints belong to
two categories: the first one improves values of release and due dates making
them more precise whereas the second one detects which disjunctions accept
only one order (immediate selections).
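The gap between the two release-date bounds for o3 in the example with operations o1, o2 and o3 can be seen on concrete numbers. The sketch below is an added illustration with hypothetical function names and data. It compares the bound obtained by local propagation, the improved bound, and the exact earliest time at which both o1 and o2 can be finished.

```python
def local_bound(r1, p1, r2, p2):
    """Bound reached by propagating each precedence separately."""
    return max(r1 + p1, r2 + p2)

def improved_bound(r1, p1, r2, p2):
    """Bound using the fact that o1 and o2 share the machine."""
    return min(r1, r2) + p1 + p2

def exact_bound(r1, p1, r2, p2):
    """Earliest completion of both operations over the two possible orders."""
    first = max(r1 + p1, r2) + p2    # o1 then o2
    second = max(r2 + p2, r1) + p1   # o2 then o1
    return min(first, second)

# Hypothetical data: o1 released at 0 with duration 5, o2 released at 1 with duration 4.
print(local_bound(0, 5, 1, 4))      # 5
print(improved_bound(0, 5, 1, 4))   # 9
print(exact_bound(0, 5, 1, 4))      # 9
```

In this instance the improved bound is tight. When the release dates are far apart, the local bound can be the larger of the two, so in general both bounds are worth keeping.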

2  The  Global  System 

In  this  part  we  present  several  properties  of  the  initial  problem  which  are  not 
implied  by  the  solver  system.  These  properties  are  relations  shared  by  groups 
of  variables  associated  with  concurrent  operations  over  a  given  machine.  Three 
levels  of  the  global  constraint  system  are  concerned  by  this: 

1.  improvement  of  precision  of  problem  variables; 

2.  detection  of  configurations  that  do  not  accept  any  solutions; 

3.  deciding  between  disjunctions  that  accept  only  one  direction. 

Thus,  we  assume  problem  data  is  examined  for  a  partially  established  schedule. 

2.1  Processing  Time  Estimation 

As for a single operation, a set J of operations (over a given machine) is forced
to be executed in a determined space of time. In the first estimation, this space
is bounded by the release date $r_J$ of the set and its due date $d_J$.

$r_J = \min_{i \in J}(r_i)$

$d_J = \max_{i \in J}(d_i)$

The  objective  is  to  calculate  the  minimal  duration  a  machine  needs  to  execute 
this  operation  set  in  order  to  improve  these  two  dates. 


Function $\mathcal{E}$. The activity duration is at least equal to the sum of the
durations of the operations to be treated. From this we deduce a natural lower
bound of the schedule of J.

$d_J \ge \min_{i \in J}(r_i) + \sum_{j \in J} p_j$


In this case the machine is assumed to handle all operations without
interruption; nevertheless some operations, not yet released, force it to stop
for waiting periods during which it is inactive. Adding these pause periods to
the sum of the durations gives a better estimation of the total treatment
duration. This second estimation is obtained by executing the operations in
increasing release date order while respecting the constraints on these dates.
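
As an illustration, the second estimation can be sketched in Python as follows.
This is only a minimal sketch under our own assumptions, not the implementation
of the solver described here: the name `earliest_completion` and the
representation of an operation as a (release date, duration) pair are
hypothetical.

```python
def earliest_completion(ops):
    """Completion date of the schedule "at the earliest".

    ops: iterable of (release_date, duration) pairs for one machine
    (hypothetical representation chosen for this sketch).
    """
    t = float("-inf")
    # Process operations in increasing release date order.
    for release, duration in sorted(ops):
        # The machine waits if the operation is not yet released.
        t = max(t, release) + duration
    return t


ops = [(0, 3), (2, 2), (4, 1), (10, 2)]
print(earliest_completion(ops))  # 12
```

On this data the simple sum of durations gives 0 + 8 = 8, whereas the schedule
"at the earliest" ends at 12 because the machine must wait for the release of
the last operation.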


Fig.  2.  Schedule  at  the  earliest 


In the example of Fig. 2, EJ1 is the sum of operation durations and EJ2 the
schedule “at the earliest”. The fourth operation forces the machine to stop for
a waiting period which is taken into account by EJ2 but not by EJ1.

The function $\mathcal{E}(J)$ computes the schedule “at the earliest”.

The computation of the schedule “at the latest” gives a better bound of this
date. It is obtained by processing operations in decreasing due date order while
respecting these date constraints.


Fig. 3. Schedule at the latest (rows: J, DJ1, DJ2)


This second example (Fig. 3) has two waiting periods in its schedule “at the
latest” (DJ2). Once again, a simple sum of operation durations is not precise
enough (DJ1).

The function $\mathcal{D}(J)$ computes the schedule “at the latest”.


Function D(J): Date
{
    J' ← ∅
    t ← ∞
    While (J \ J' ≠ ∅)
    {
        i such that d_i = max_{j ∈ J \ J'} (d_j)
        t ← min(t, d_i) − p_i
        J' ← J' ∪ {i}
    }
    Return (t)
}

Computation of $\mathcal{D}(J)$

Property 2 The function $\mathcal{D}(J)$ is an upper bound of the release date of
the set J.

$r_J \le \mathcal{D}(J)$
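
The computation “at the latest” can be sketched in Python in the same spirit.
Again this is a hedged illustration: the name `latest_start` and the (due date,
duration) pair representation are assumptions of this sketch, and the update
rule is our reading of "decreasing due date order respecting these date
constraints".

```python
def latest_start(ops):
    """Latest date at which the machine can start a set of operations.

    ops: iterable of (due_date, duration) pairs for one machine
    (hypothetical representation chosen for this sketch).
    """
    t = float("inf")
    # Place operations from the end, in decreasing due date order.
    for due, duration in sorted(ops, reverse=True):
        # An operation must end by its own due date and before the
        # start of the operations already placed after it.
        t = min(t, due) - duration
    return t


ops = [(20, 2), (15, 3), (8, 1), (7, 2)]
print(latest_start(ops))  # 5
```

Here the simple estimate, latest due date minus the sum of durations, gives
20 - 8 = 12, while the schedule "at the latest" must start by 5 because of two
waiting periods.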


2.2  Maximum  Delay 

The maximum delay $\delta$ of a set J denotes the difference between the total
amount of time allotted to the machine to process all operations of J and the
estimated duration of this treatment¹.

$\delta_J = (d_J - r_J) - \sum_{i \in J} p_i$

The maximum delay can be likened to a measure of the degree of freedom of the
machine: the greater $\delta$ is, the greater the freedom of the machine to start
its processing. Conversely, the smaller $\delta$ is, the more the machine can be
considered as constrained. In the same way that release/due dates have been
bounded, the value of $\delta$ can be estimated more precisely. To do this, we
first compare the valuation of the schedule “at the earliest” with the release
date of J, then the valuation of the schedule “at the latest” with the due date
of J.

$\delta_J \le \max_{i \in J}(d_i) - \mathcal{E}(J)$

$\delta_J \le \mathcal{D}(J) - \min_{i \in J}(r_i)$

Consequently we select for the value of $\delta$ the minimum of these two
estimations:

$\delta_J \le \min\left(\max_{i \in J}(d_i) - \mathcal{E}(J),\ \mathcal{D}(J) - \min_{i \in J}(r_i)\right)$

In the example of Fig. 4, we have RJ1 > RJ2 > RJ3, thus we deduce
$\delta_J \le$ RJ3.




Fig.  4.  The  three  valuations  of  the  maximum  delay 


Property  3  If  the  maximum  delay  is  negative,  then  the  current  configuration 
does  not  accept  any  solution. 
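
The bound on the maximum delay and the failure test of Property 3 can be
illustrated by the following Python sketch. The function names are hypothetical,
and the two valuations of the schedules "at the earliest" and "at the latest"
are assumed to be supplied by the caller.

```python
def max_delay_bound(releases, dues, earliest_end, latest_start):
    """Upper bound on the maximum delay (slack) of an operation set.

    earliest_end and latest_start are assumed to be the valuations of
    the schedules "at the earliest" and "at the latest" of the same set.
    """
    by_earliest = max(dues) - earliest_end
    by_latest = latest_start - min(releases)
    return min(by_earliest, by_latest)


def is_consistent(releases, dues, earliest_end, latest_start):
    # A negative maximum delay means no feasible schedule (Property 3).
    return max_delay_bound(releases, dues, earliest_end, latest_start) >= 0


# Four operations with durations 3, 2, 1, 2 (sum 8).
releases, dues = [0, 2, 4, 10], [9, 12, 14, 15]
print(max_delay_bound(releases, dues, earliest_end=12, latest_start=6))  # 3
```

For this data the first estimation of the delay is (15 - 0) - 8 = 7, whereas
the refined bound is 3.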

2.3  Immediate  Selections 

In this section we recall part of the immediate selection algorithms of
Carlier & Pinson [CP89]. The general method consists in considering each machine
individually and trying to determine disjunctions that accept only one
direction. Furthermore, at each step we verify the existence of feasible
schedules with the current data in order to identify configurations which cannot
provide a solution.

Four Words of Vocabulary... An operation is said to be totally ordered if
no disjunction involving this operation remains undecided.

A  clique  denotes  a  set  of  operations  not  totally  ordered  attached  to  a  given 
machine  (or  sharing  a  given  resource).  The  input  of  a  clique  is  the  operation 
that  begins  the  schedule  of  the  clique  and  the  output  the  one  that  finishes  it. 

¹ It is also called “slack”.

Input and Output of Clique Detection. For a given clique I we have to
select which operation must be the input of the clique and/or which operation
has to be the output.

With the aim of identifying operations that cannot take the lead of clique I, we
compute for each operation i a lower bound of the due date of the set if i is
the first processed operation. If this valuation exceeds the latest due date of I
(except i, because it is assumed to be at the head), the operation i cannot
precede the set.

$r_i + \sum_{j \in I} p_j > \max_{j \in S_I \setminus \{i\}} d_j$    (4)


Let $E_I$ be a set initialized with I and reduced with test (4): $E_I$ describes
the set of operations that are candidates for the input of clique I. Similarly,
we construct the set $S_I$ of candidates for the output of clique I by
evaluating, for each i, an upper bound of the release date of I when i is the
last processed operation, and comparing it with the earliest release date of I
(except i). Relation (5) identifies inappropriate candidates for the output
of I.

$d_i - \sum_{j \in I} p_j < \min_{j \in E_I \setminus \{i\}} r_j$    (5)


Property 4 If the set $E_I$ amounts to a single element, then this element must
be the input of clique I for any solution. Similarly, if the set $S_I$ amounts to
a single element, this element must be the output of I for any solution.


Property 5 If one of the sets $E_I$ or $S_I$ is empty ($I \ne \emptyset$), then no
solution can be built with the current configuration.
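
The two filtering tests can be illustrated by a short Python sketch. It is a
simplified reading of the tests described above, in which each operation is
compared against all the other operations of the clique; the function names and
the dictionary representation (name mapped to release date, duration, due date)
are assumptions of this sketch.

```python
def input_candidates(ops):
    """Operations that may still be processed first in the clique.

    ops: dict mapping operation name -> (release, duration, due).
    """
    total = sum(p for _, p, _ in ops.values())
    keep = set()
    for i, (r_i, _, _) in ops.items():
        other_dues = [d for j, (_, _, d) in ops.items() if j != i]
        # Starting with i, the whole set cannot end before r_i + total.
        if not other_dues or r_i + total <= max(other_dues):
            keep.add(i)
    return keep


def output_candidates(ops):
    """Operations that may still be processed last in the clique."""
    total = sum(p for _, p, _ in ops.values())
    keep = set()
    for i, (_, _, d_i) in ops.items():
        other_releases = [r for j, (r, _, _) in ops.items() if j != i]
        # Ending with i, the whole set cannot start after d_i - total.
        if not other_releases or d_i - total >= min(other_releases):
            keep.add(i)
    return keep


ops = {"a": (0, 3, 6), "b": (2, 2, 9), "c": (4, 3, 9)}
print(sorted(input_candidates(ops)))   # ['a']
print(sorted(output_candidates(ops)))  # ['b', 'c']
```

In this example the set of input candidates is reduced to a single element, so
by Property 4 operation a must be processed first; an empty result would
indicate a failure in the sense of Property 5.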


3  Implementation 

The properties mentioned above have led us to describe a constraint set which has
been realized by means of a solver system based on interval arithmetic [BOV94].
This makes it possible to link variables through ordinary relations such as
$z = x + y$, $x \le y$, ...

The representation of boolean expressions is provided by intervals [0, 1] which
allow a three-valued logic (0: false, 1: true, [0, 1]: indeterminate).

As well as the classical functionalities of this kind of tool, our constraint
solver prototype also offers the means to describe sets and constraints over
sets. This feature appreciably simplifies the implementation of immediate
selections.

This section describes the job-shop problem in terms of constraints and gives
the enumeration strategy used in the experiments.

3.1  Basic  Constraints 

All the relations described in section 1 are copied into the constraint system as
they are. Nevertheless it should be noted that disjunctions (3) are represented
by three logic expressions associated with variables holding their truth value.
Thus, for each pair of operations i and j we have:

$D_{ij} = [d_i \le r_j]$

$D_{ji} = [d_j \le r_i]$

$D_{ij} = \neg D_{ji}$

If $D_{ij}$ takes the true value, then $D_{ji}$ becomes false and operation i is
processed before operation j.
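
To make the three-valued reading concrete, the following Python sketch
evaluates a relation of the form "due date of i is at most release date of j"
when both dates are only known as intervals. It is an illustration under our
own conventions, not the solver's interface: the function name `precedes` is
hypothetical and `None` stands for the indeterminate interval [0, 1].

```python
def precedes(d_i, r_j):
    """Truth value of d_i <= r_j for interval-valued dates.

    d_i and r_j are (low, high) bounds. Returns True, False, or None
    when the relation is still indeterminate.
    """
    if d_i[1] <= r_j[0]:
        return True    # holds for every value in the intervals
    if d_i[0] > r_j[1]:
        return False   # fails for every value in the intervals
    return None        # cannot be decided yet


print(precedes((5, 8), (9, 12)))    # True
print(precedes((10, 12), (3, 9)))   # False
print(precedes((5, 10), (7, 12)))   # None
```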

3.2  Updating  Release  and  Due  Dates 

Let us consider an operation i and build the set $P_i$ of its predecessors and
the set $S_i$ of its successors. Relations of type (3) shared between $S_i$ and i,
then $P_i$ and i, are converted into precedence relations:

$\forall j \in S_i,\ d_i \le r_j$

$\forall j \in P_i,\ d_j \le r_i$

The combined action of these two constraint groups enables us to verify the
following relations:

$r_i \ge \max_{j \in P_i}(d_j)$

$d_i \le \min_{j \in S_i}(r_j)$

Such a constraint system is not able to find good bounds for due and release
dates; consequently, to improve these evaluations, we complete it with
constraints built with the help of functions $\mathcal{E}$ and $\mathcal{D}$. These
last two give much better estimations of the dates (Prop. 1 and 2):

$r_i \ge \mathcal{E}(P_i)$    (6)

$d_i \le \mathcal{D}(S_i)$    (7)

3.3  The  Sets 

Using  sets  of  integers  is  an  efficient  and  practical  way  to  implement  immediate 
selection  algorithms.  Indeed,  each  element  of  a  set  (an  integer)  is  associated 
with  a  particular  task  and  its  properties  over  the  studied  resource  are  stored  as 
vectors  such  as  r  (release  dates)  or  d  (due  dates).  Partial  orders  are  taken  from 
a  boolean  matrix  D  (see  above  3.1). 

In this section we assume all information to be attached to a given machine;
the following statements therefore have to be made for each machine
individually, and are sufficient.

Sets Definition. Let neo be the set of not yet ordered operations (i.e. a
clique):

$\mathit{neo} : \{1..n\} \setminus \{ i \in \{1..n\} \mid \forall j \ne i \in \{1..n\},\ (i \prec j) \lor (j \prec i) \}$

Definitions set forth in the previous section do not take into account the fact
that a part of the disjunctions of an operation set may already be decided.
According to this information, we can add two complementary subsets of neo: the
set ei describes operations that could be the input of the clique because there
is no operation to process before them; and si characterizes operations that
could be the output because they have no successor.

$\mathit{ei} : \mathit{neo} \setminus \{ i \in \mathit{neo} \mid \exists j \ne i \in \mathit{neo},\ j \prec i \}$

$\mathit{si} : \mathit{neo} \setminus \{ i \in \mathit{neo} \mid \exists j \ne i \in \mathit{neo},\ i \prec j \}$

Moreover, the construction of the sets of candidates for the input/output of the
clique is based on the estimation of the time required to treat the whole
clique. Actually other operations, not belonging to the clique because they are
totally ordered, have to be processed in the same span of time. To obtain good
valuations of the execution duration we complete neo with these operations. The
set at is composed of the operations to be processed in the time allotted to the
machine to treat neo. It is built by adding to neo all operations with at least
one predecessor belonging to ei\si and at least one successor belonging to
si\ei.

$\mathit{at} : \mathit{neo} \cup \{ i \in \{1..n\} \setminus \mathit{neo} \mid \exists j \in \mathit{ei} \setminus \mathit{si},\ \exists k \in \mathit{si} \setminus \mathit{ei},\ (j \ne k) \land (j \prec i) \land (i \prec k) \}$

Intermediate sets ei, si and at are used as a base to make up the sets of
candidates for input and output, ee and es.

$\mathit{ee} : \mathit{ei} \setminus \{ i \in \mathit{ei} \mid r_i + \sum_{j \in \mathit{at}} p_j > \max_{k \in \mathit{es} \setminus \{i\}} (d_k) \}$

$\mathit{es} : \mathit{si} \setminus \{ i \in \mathit{si} \mid d_i - \sum_{j \in \mathit{at}} p_j < \min_{k \in \mathit{ee} \setminus \{i\}} (r_k) \}$
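
As an illustration of the first three sets, the following Python sketch derives
neo, ei and si from the disjunctions already decided on one machine. The
function name `clique_sets` and the encoding of decided disjunctions as a set
of pairs (i, j), meaning that i is processed before j, are assumptions of this
sketch rather than the data structures of our prototype.

```python
def clique_sets(n, before):
    """Compute the sets neo, ei and si for operations 1..n.

    before: set of pairs (i, j) meaning "i is processed before j"
    (hypothetical encoding of the decided disjunctions).
    """
    ops = range(1, n + 1)

    def decided(i, j):
        return (i, j) in before or (j, i) in before

    # neo: operations with at least one undecided disjunction.
    neo = {i for i in ops if any(not decided(i, j) for j in ops if j != i)}
    # ei: operations of neo with no predecessor inside neo.
    ei = {i for i in neo if not any((j, i) in before for j in neo if j != i)}
    # si: operations of neo with no successor inside neo.
    si = {i for i in neo if not any((i, j) in before for j in neo if j != i)}
    return neo, ei, si


neo, ei, si = clique_sets(4, {(1, 2), (1, 3), (1, 4), (2, 3)})
print(sorted(neo), sorted(ei), sorted(si))  # [2, 3, 4] [2, 4] [3, 4]
```

Operation 1 is totally ordered and therefore leaves the clique; operation 3
cannot be the input because 2 precedes it, and 2 cannot be the output for the
symmetric reason.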


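Constraints Over Sets. While neo is not empty, sets ee and es have to contain
at least one element (Prop. 5).

$\mathit{neo} \ne \emptyset \land (\mathit{ee} = \emptyset \lor \mathit{es} = \emptyset) \Rightarrow \text{Fail}$

The unique element of ee or es determines the clique input or output (Prop. 4).

$\mathrm{card}(\mathit{ee}) = 1 \Rightarrow \forall i \in \mathit{ee},\ \forall j \in \mathit{neo} \setminus \mathit{ee},\ i \prec j$

$\mathrm{card}(\mathit{es}) = 1 \Rightarrow \forall i \in \mathit{es},\ \forall j \in \mathit{neo} \setminus \mathit{es},\ j \prec i$

The maximum delay estimation must be non-negative for each set neo (Prop. 3).

$\delta_{\mathit{neo}} < 0 \Rightarrow \text{Fail}$

3.4 Enumeration Strategy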


For a job-shop scheduling problem, a solution can be characterized by one of
two variable sets: first, we can choose the release dates directly, in which
case the solution is the list of beginning dates of each operation; or we can
examine the mutually exclusive constraints, and the solution is described by a
list of decided disjunctions. In this last case, the one we have selected here,
the solution is a passage order of tasks defined by the list of ordered
operation pairs for each machine (e.g. 2 ≺ 1, 2 ≺ 4, ...). This list is embodied
by boolean variables which all receive a value when a solution is reached. For a
problem of size n × m, $m \times \frac{n(n-1)}{2}$ variables must be found. This
gives the search space a dimension of $2^{m \times n(n-1)/2}$ combinations².
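
The order of magnitude can be checked with a few lines of Python. The sketch
assumes, as is usual for an n × m job-shop instance, that each of the m
machines processes exactly one operation of each of the n jobs, so that there is
one disjunction per unordered pair of operations on a machine; the function name
is hypothetical.

```python
from math import comb


def disjunction_count(n_jobs, n_machines):
    # One boolean variable per unordered pair of operations on each machine.
    return n_machines * comb(n_jobs, 2)


count = disjunction_count(10, 10)
print(count)                  # 450
print(len(str(2 ** count)))   # 136 decimal digits
```

A 10 × 10 problem thus involves 450 boolean variables, i.e. a search space of
2^450 combinations before inconsistent ones are discarded.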

At each choice point, a pair of non-ordered operations must be selected (i.e.
one of the undetermined boolean variables is chosen), then a passage order has
to be picked (i.e. a truth value is fixed for the variable). It is worth noting
that, because of the size of the search space, the behavior of the enumeration
procedure is decisive. Two criteria are consequently needed. The first one
selects the machine on which a decision has to be made; the second chooses an
operation pair to order and a means of fixing the direction of the disjunction
(however, other approaches could be used [Col96]).


Choice of a Machine. The aim of this first criterion is to designate the machine
which will have, a priori, the strongest influence on the rest of the system, in
other words the machine whose scheduling should trigger the greatest number of
deductions (from the constraint system). This is why we first choose machines
with the lowest maximum delay $\delta_{\mathit{neo}}$. In case of a tie, the
machine which has the smallest set neo is chosen.
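
This criterion amounts to a lexicographic minimum, as the following Python
sketch suggests. The function name, the dictionary layout and the decision to
skip machines whose set neo is empty are assumptions made for the illustration.

```python
def choose_machine(machines):
    """Pick the machine with the lowest maximum delay, ties broken by
    the smallest set neo.

    machines: dict mapping machine name -> (max_delay, neo_set).
    Machines with an empty neo are skipped (assumption of this sketch).
    """
    candidates = {name: info for name, info in machines.items() if info[1]}
    return min(candidates,
               key=lambda name: (candidates[name][0], len(candidates[name][1])))


machines = {
    "M1": (5, {1, 2, 3}),
    "M2": (2, {4, 5, 6}),
    "M3": (2, {7, 8}),
}
print(choose_machine(machines))  # M3
```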


Choice of a Pair of Operations. On the selected machine, a disjunction must
be chosen. This choice is made in the set neo (assigned to the selected machine)
among the undecided disjunctions. To “help” the system, we first choose a pair
which seems difficult to decide between. Thus, for a given set J we take
the pair (i, j) that minimizes the difference between the release date of the
first and the due date of the second.

The disjunction will take the direction i ≺ j.

Optimality Search. For this kind of problem, two search techniques can be
employed regarding optimality:

- Min-Max: for each solution S found with cost $C_S$, the enumeration procedure
is re-initialized with the new constraint $C < C_S$;

- Minimize: we use a specialized predicate minimize(C) which, each time a
solution S is found, backtracks in the search tree until the constraint
$C < C_S$ is accepted (i.e. it does not cause the system to fail) and then adds
it. The enumeration then continues with the new constraint system.

² A large number of combinations are inconsistent (such as 1 ≺ 2, 2 ≺ 3 with
3 ≺ 1) and are not explored.

This second solution gives better results with the selected enumeration. It
should also be noted that, whichever technique is employed, an initial upper
bound (i.e. an initial cost $C_i$) may be fixed in order to reduce the search
time. In this case, another algorithm is needed to find this initial cost.

4  Experimental  Results 

The test set is made up of 18 problems taken from the literature and reputed to
be hard (orb1-10, la16-20, abz5-6, mt10), added to a collection of 50 randomly
generated problems, all of size 10 × 10. The resolution of each problem has been
entirely left to our system: no upper bound is fixed a priori and no lower bound
is required. Optimum search and proof of optimality are performed in only one
execution of the enumeration procedure. The following table summarizes results
for the 10 best known problems. Times are expressed in seconds and have been
obtained on a SparcStation 5. The first three columns give respectively the
problem’s usual name, its optimal cost (Opt) and the cost of the first solution
found (Sol1). The next two columns give the number of choice points (PcOpt) and
the time (Topt) needed to reach an optimal solution. Finally, the last two
columns give the total number of choice points (PcTot) and the total amount of
time (Ttot) needed to find the best solution and to prove its optimality.
Note that the first solution is reached in less than one second in all cases.


Table 1. Resolution for 10 well known problems

Problem    Opt   Sol1    PcOpt  Topt(s)    PcTot  Ttot(s)
mt10       930   1028     5178       55    10991      116
orb1      1059   1335    43592      431    48842      495
orb2       888    991     5709       49     8032       71
orb3      1005   1194   120064     1221   135394     1393
orb4      1005   1193     2626       21     6538       64
orb5       887    989     5878       50     6936       62
abz5      1234   1468   169430      325   172162      350
abz6       943   1053     1140       11     1536       15
la19       842    946     3023       21     5809       45
la20       902    964    14337      109    15434      120

Among the 50 random problems, only 2 need more than 2 minutes of computational
time; problems generated by these means seem to be clearly easier!
Moreover, the computational times required by all other problems, not mentioned
in the above table, range from 3s to 140s.

Although no particular technique is employed to this end, the cost of the first
solution is often about 10% above the optimum. We have also observed that the
search progresses in steps, i.e. intermediate solutions are clustered: reaching
each group of solutions needs a certain amount of computational time, but all
solutions of a given group are then found quickly. This phenomenon is
particularly significant for the abz5 problem: after 5s there is a solution at
1392; then at 30s a second stage stops at 1357; the next step is reached after
300s for a cost of 1346, which is then improved to 1242; the last step begins
40s later for a cost of 1239, immediately followed by the optimum 1234. From
this remark, we deduce that the quality of an upper bound depends on the group
of solutions it appears in (two values extracted from the same cluster certainly
give the same results): for the abz5 example, there are four groups, first over
1391, then between 1391 and 1357, then between 1356 and 1242, and finally
between 1241 and 1234. Of course, the nearer to the optimum the upper bound is,
the better the result, but differences between values of a given group are not
significant.

In order to compare our algorithm to that of Applegate & Cook, we have
also executed resolutions employing the upper bounds put forth in [AC91].


Table 2. Resolution for 10 well known problems with upper bounds [AC91]

Problem    Opt   Bsup    PcTot  Ttot(s)  PcTot[AC91]  Ttot[AC91](s)
mt10       930    930      ...       74        16055            372
orb1      1059   1070      ...      306        71812            ...
orb2       888    890     4539       40       153578            ...
orb3      1005   1021   113549     1248       130181            ...
orb4      1005   1019     4509       47        44547           1013
orb5       887    896     5483       52        23113            526
abz5      1234   1245     8371       68        57848            951
abz6       943    943      527        5         1269             90
la19       842    848     3549       28        93807           1462
la20       902    911    12195       89        81918           1402

Even if times cannot be seriously compared (the computers used are probably
too different), the number of choice points can be taken as a criterion, and on
these examples our system performs better.

Table 3 shows the times required to completely solve (optimum + proof) the
10 problems by IlogSchedule [BLPN95] and the Task Intervals system [CL95]. The
times, expressed in seconds, were obtained on an RS6000 computer for the first
column and on a Pentium 90 for the second column. The last column of the table
shows our performance on a SparcStation 5.

Table 3. Compared performances

Problem   [BLPN95]   [CL95]   LocPerf
mt10           235      151       116
orb1           407      189       495
orb2           507       31        71
orb3           606      588      1393
orb4           213      371        64
orb5           210       89        62
abz5           282      127       350
abz6           100        8        15
la19           269      100        45
la20           496       78       120

These two other techniques, also based on constraint solving, use additional
approximate methods to compute good bounds before starting enumeration in
order to reduce the search tree. This is why the numbers of choice points are not
indicated: this measure is significant only in the enumeration procedure.
These pre-treatments (one step for IlogSchedule and two steps for Task Intervals)
obviously need developments much more complex than the ones suggested in this
paper. Our algorithm is nevertheless still competitive.

5  Conclusion 

We have presented a relatively simple algorithm based exclusively on constraint
programming which is capable of solving 10 × 10 job-shop problems efficiently.
Based on the estimation of release/due dates of operations, this method does not
depend on the constraint solver used. Moreover, the power of the suggested
system is emphasized by the rudimentary nature of the enumeration strategy, which
is sufficient both for finding an optimal solution of the problem and for proving
its optimality without any prior knowledge. Although appreciably simpler, this
method often proves to be more efficient than other techniques in the same field
(constraint solving), and remains comparable to combined techniques (such as
approximate + enumerative).

In spite of these encouraging results, this method cannot tackle larger problems.
In order to process larger problems, we are at present working on an extension of
the set system and we are aiming toward the development of a more intelligent
enumeration strategy.

Abstract. Constraint propagation algorithms vary in the strength of
propagation they apply. This paper investigates a simple configuration
for adaptive propagation - the process of varying the strength of
propagation to reflect the dynamics of search. We focus on two propagation
methods, Arc Consistency (AC) and Forward Checking (FC). AC-based
algorithms apply a stronger form of propagation than FC-based
algorithms; they invest greater computational effort to detect inconsistent
values earlier. The relative payoff of maintaining AC during search as
against FC may vary for different constraints and for different
intermediate search states. We present a scheme for Adaptive Arc Propagation
(AAP) that allows the flexible combination of the two methods.
Meta-level reasoning and heuristics are used to dynamically distribute
propagation effort between the two. One instance of AAP, Anti-Functional
Reduction (AFR), is described in detail here. AFR achieves precisely the
same propagation as a pure AC algorithm while significantly improving
its average performance. The strategy is to gradually reduce the scope
of AC propagation during backtrack search to exclude those arcs that
may be subsequently handled as effectively by FC. Experimental results
confirm the power of AFR and the validity of adaptive propagation in
general.


1  Introduction 

1.1  Background 

The strategy of utilizing meta-level knowledge inference to reduce the
computation required to achieve full Arc Consistency in a constraint network has
been shown to be successful [15, 5, 2]. However, the algorithms suggested assume
that constraint characteristics are known at the start of the search.

Depth-first backtracking tree search enhanced with Forward Checking (FC)
or maintained Arc Consistency (AC) comprises the repeated application of two
alternating steps: labelling (the assignment to a variable of a value from its
domain) and constraint propagation (the deletion of inconsistent values from the
domains of unassigned variables) [6, 8, 9]. Hence a single application of the
two-step label-propagate process results in a transformation of the original
problem to a sub-problem with one or more reduced domains. Any search decisions
(variable assignments) may render initial meta-knowledge (about structure,
tightness, density, etc.) obsolete, since sub-problems have different
characteristics.

An instance demonstrating the effectiveness of using up-to-date meta-knowledge
to improve the performance of the search algorithm is found in the popular first
fail heuristic, often used in conjunction with FC [8]. Other dynamic meta-level
heuristics have been found to achieve significant performance gains [13, 7]. All
these have adapted algorithm behaviour by focussing search on promising
sub-problems through variable and value ordering. As far as we know, little or no
research has been conducted into the utility of dynamically altering the
extent of propagation deployed by the algorithm. The purpose of this paper is
to demonstrate that meta-level knowledge about the problem may be updated
to reflect the changes brought about by label-update steps and used locally to
select appropriate constraint propagation methods and heuristics for remaining
sub-problems. We focus in particular on a meta-knowledge propagation scheme,
Adaptive Arc Propagation (AAP), that has a choice of propagation methods
at its disposal. The search algorithm initially decides a status for each binary
constraint arc that determines how changes to the “source” variable’s domain
are propagated onto the “destination” variable. In the case of full AC and FC,
this means that full propagation occurs on any change and on instantiation
respectively. As the labelling and propagation take place, the algorithm updates
meta-level knowledge and uses it to change the status of these constraint arcs.

A  number  of  algorithms  implementing  a  hybrid  FC  and  AC  search  already 
exist.  These  hybridizations  have  been  static  in  the  sense  that  they  apply  FC  and 
AC  in  specific  phases  or  to  specific  sets  of  constraints,  and  do  not  update  their 
configuration  according  to  the  dynamics  of  the  search.  Such  hybridizations  seem 
to  be  less  effective  than  pure  AC  for  hard  problems  [11]. 

One simple instance of the general framework, Anti-Functional Reduction
(AFR), is described in detail here. The meta-level reasoning applied allows AAP
to achieve precisely the same propagation as a pure AC algorithm while
significantly improving its average performance. This is achieved by gradually
reducing the scope of AC propagation during search to exclude those arcs that are
handled as effectively by FC propagation. Experimental results confirm large
performance gains for AFR, indicating the validity of adaptive propagation and
the viability of more sophisticated AAP instances.


1.2  Paper  Outline 


The paper is structured in three descriptive sections followed by experimental
results and a conclusion. Section 2 details the difference between AC and
FC Propagation to suggest how the appropriate method may be selected for
particular constraint arcs. Adaptive Arc Propagation, a scheme that allows the
dynamic integration of AC and FC propagation, is described in Sect. 3. Section 4
then gives a simple instance of this scheme - Anti-Functional Reduction (AFR)
- that has been tested experimentally. The results for AFR (Sect. 5) confirm
large performance gains when applied to hard problems. Conclusions are drawn
and opportunities for further research are outlined in Sect. 6.

2 Arc Consistency and Forward Checking

Arc  Consistency  is  more  powerful  than  Forward  Checking  in  that  it  provides 
earlier  detection  of  inconsistent  values.  We  describe  the  exact  conditions  under 
which  Arc  Consistency  is  superior,  restricting  our  analysis  to  binary  constraint 
arcs. 


2.1  Basic  Notation 

For a definition of the Constraint Satisfaction Problem (CSP) see [14]. If X and
Y are CSP variables we define the following notation:

- $dom(X)$ represents the domain of X;

- $|X|$ is the size of $dom(X)$;

- $x_i$, $1 \le i \le |X|$, are the elements in $dom(X)$;

- $r(X, x_i)$ is a reduction operation that removes $x_i$ from the domain of X
such that the new domain $dom'(X) = dom(X) \setminus \{x_i\}$;

- $r(X, R)$ represents the set of reduction operations given by
$\{ r(X, x_i) \mid x_i \in R \}$;¹

- a binary constraint $constr_{XY}$ on X and Y is represented by the two directed
arcs $arc_{XY}$ and $arc_{YX}$ corresponding to the two directions in which the
constraint can propagate.


2.2  FC-  and  AC-monitoring 

We compare below the behaviour of AC and FC on a directed arc $arc_{XY}$ by
counting the number of different possible reductions on X for which AC would
yield a reduction on Y, but FC would not. We first introduce the terminology
used:

- An $arc_{XY}$ is said to propagate a reduction $r(X, R)$ of X when the domain
of Y is reduced to exclude those values inconsistent with X's new domain
and $constr_{XY}$ ($constr_{YX}$).

- An $arc_{XY}$ is said to be FC-monitored in a propagation sequence if it is used
only to propagate those reductions of X that are instantiations.

- An $arc_{XY}$ is said to be AC-monitored if it propagates any reduction of X.
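
The difference between the two monitoring modes can be illustrated by the
following Python sketch. It is a simplified illustration and not the AAP
algorithm itself: the function names, the representation of the constraint as a
set of allowed (x, y) pairs, and the `monitoring` flag are assumptions of this
sketch.

```python
def revise(dom_x, dom_y, allowed):
    """Keep the values of Y that have at least one support in dom_x."""
    return {y for y in dom_y if any((x, y) in allowed for x in dom_x)}


def propagate(dom_x, dom_y, allowed, monitoring):
    """Propagate a reduction of X along the arc from X to Y.

    monitoring is "AC" (propagate any reduction of X) or "FC"
    (propagate only when X has been instantiated).
    """
    if monitoring == "FC" and len(dom_x) != 1:
        return set(dom_y)  # not an instantiation: nothing is propagated
    return revise(dom_x, dom_y, allowed)


allowed = {(1, "a"), (1, "b"), (2, "b"), (1, "c"), (2, "c"), (3, "c")}
dom_y = {"a", "b", "c"}

# Value 1 is removed from dom(X) = {1, 2, 3}; X is not yet instantiated.
print(sorted(propagate({2, 3}, dom_y, allowed, "AC")))  # ['b', 'c']
print(sorted(propagate({2, 3}, dom_y, allowed, "FC")))  # ['a', 'b', 'c']
# X is instantiated to 3: both modes now prune Y in the same way.
print(sorted(propagate({3}, dom_y, allowed, "FC")))     # ['c']
```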


2.3  When  Does  AC  Produce  More  Reductions  than  FC? 

Constraints  are  relations  and  may  be  represented  in  terms  of  a  relation  matrix. 
These  matrices  are  useful  for  analysing  the  effects  of  reductions. 

^1 An instantiation of X - when its domain is reduced to one value through labelling
or propagation - thus results from applying any set of reductions r(X, I) such that
|I| = |X| - 1.
Example 1. A constraint constr_XY that has the tuple representation

{  (^)^)  J  (^}b)  )  (^>^)  >(^)^)  )  } 

is  equivalent  to  the  relation  matrix  of  Table  1. 

In the table Boolean values indicate whether a given pair of values hold in
the constraint relation. In addition the example matrix is annotated with the
quantities of ones in a given row. These are usually referred to in the Arc
Consistency literature as the support and denoted here by $S_X(y_i)$ [10]. In our
comparison of Arc Consistency and Forward Checking the important quantity is
the number of zeros; we call this quantity the support-complement. The
support-complement $\bar{S}_X(y_i)$ may be obtained in terms of $S_X(y_i)$ and $|X|$:

$$\bar{S}_X(y_i) = |X| - S_X(y_i) \qquad (1)$$

Consider the impact of reductions of X on Y resulting from the propagation
of arc_XY. AC-monitoring is superior to FC-monitoring when a reduction of X
that is not an instantiation results in a reduction of Y. Those values of Y that
have two or more zeros in their row are the only candidates for deletion in
such a situation. This is because values that have no zeros are always consistent
with the constraint arc, while those with a single zero might be deleted only
on the instantiation of X. The condition $\bar{S}_X(y_i) > 1$ thus determines whether
AC-monitoring of arc_XY might produce a reduction of $y_i$ where FC-monitoring
would not. The final column of the matrix reflects this analysis by listing the
minimum amount of propagation required to maintain the Arc Consistency of the
domain of Y with respect to the arc-value pair (arc_XY, $y_i$); some arc-value pairs
need more propagation than others to maintain an Arc Consistent Y domain.


Table 1. An example relation matrix

Value of Y   ones   zeros   Minimum monitoring of arc_XY
                            required for achieving AC
a            1      2       AC
b            2      1       FC
c            3      0       None

An anti-functional arc arc_XY has $\forall y_i : \bar{S}_X(y_i) \le 1$. The class of
constraints constr_XY that have anti-functional arc_XY and arc_YX are known as
anti-functional constraints [15]. The disequality constraint ≠ is an example of
an anti-functional constraint.
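
The support and support-complement can be computed directly from a tuple
representation. The following is a minimal Python sketch, not the authors'
implementation: the function names are illustrative, and the relation `example`
is a hypothetical one chosen only so that its row counts (1, 2 and 3 ones)
agree with the annotations of Table 1.

```python
def support_counts(tuples, dom_x, dom_y):
    """S_X(y): number of values x in dom(X) such that (x, y) is allowed."""
    return {y: sum((x, y) in tuples for x in dom_x) for y in dom_y}


def support_complements(tuples, dom_x, dom_y):
    """Number of zeros in each row, obtained as |X| - S_X(y) (Eq. 1)."""
    support = support_counts(tuples, dom_x, dom_y)
    return {y: len(dom_x) - support[y] for y in dom_y}


def is_anti_functional_arc(tuples, dom_x, dom_y):
    """arc_XY is anti-functional when every row has at most one zero."""
    zeros = support_complements(tuples, dom_x, dom_y)
    return all(z <= 1 for z in zeros.values())


dom = ["a", "b", "c"]
# Hypothetical relation (pairs are (x, y)); only its row counts follow Table 1.
example = {("a", "a"), ("a", "b"), ("c", "b"),
           ("a", "c"), ("b", "c"), ("c", "c")}
# Disequality on the same domain.
neq = {(x, y) for x in dom for y in dom if x != y}

print(support_counts(example, dom, dom))
print(support_complements(example, dom, dom))
print(is_anti_functional_arc(example, dom, dom), is_anti_functional_arc(neq, dom, dom))
```

The disequality relation leaves exactly one zero in each row, which is why it
passes the anti-functionality test while the example relation does not.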


2.4  AC-superiority 

For values having $\bar{S}_X(y_i) > 1$, the number of possible reduction sets r(X, R)
that are not instantiations and result in the deletion of $y_i$ in AC-monitoring but
not in FC-monitoring is given by:

$$n_1 = \sum_{r=0}^{\bar{S}_X(y_i)-2} \binom{\bar{S}_X(y_i)}{r} \qquad (2)$$

while the total number of possible reductions that are not instantiations and are
not the empty set is given by:

$$n_2 = \sum_{r=0}^{|X|-2} \binom{|X|}{r} - 1 \qquad (3)$$


Assuming that all possible reduction sets r(X, R) are equally probable in a
propagation sequence we may obtain figures for the probability of AC-monitoring
achieving a reduction where FC-monitoring would not; we call this probability
AC-superiority. AC-superiority depends on the support-complement of $y_i$ and
$|X|$.

$$\text{AC-superiority}(y_i) = \frac{n_1}{n_2} \qquad (4)$$

Figure 1 shows how AC-superiority varies for a variable X of domain size 10
according to the support-complement of $y_i$.


Fig. 1. AC-superiority vs. support-complement ($|X| = 10$); horizontal axis: no. of
zeros in row of $y_i$
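
The quantities behind Fig. 1 can be tabulated with a few lines of Python. This
is a sketch under stated assumptions rather than the authors' code: it reads
(2) as a sum of binomial coefficients over r = 0 .. zeros - 2 and (3) as the
same kind of sum over r = 0 .. |X| - 2 minus one for the empty set, and it
assumes each value still has at least one support (zeros < |X|). The function
names are illustrative.

```python
from fractions import Fraction
from math import comb


def n1(zeros):
    """Non-instantiating reduction sets that delete y_i under AC only (Eq. 2)."""
    return sum(comb(zeros, r) for r in range(zeros - 1))  # r = 0 .. zeros - 2


def n2(size):
    """Reduction sets that are neither instantiations nor empty (Eq. 3)."""
    return sum(comb(size, r) for r in range(size - 1)) - 1  # r = 0 .. size - 2


def ac_superiority(zeros, size):
    """Ratio n1 / n2 (Eq. 4) for a value with `zeros` zeros in its row."""
    if not 0 <= zeros < size:
        raise ValueError("assumes 0 <= zeros < |X| (at least one support)")
    return Fraction(n1(zeros), n2(size))


# A table indexed by (domain size, support-complement), as suggested in the
# text for run-time lookup.
table = {(size, zeros): ac_superiority(zeros, size)
         for size in (5, 10) for zeros in range(size)}

for zeros in range(10):
    print(zeros, float(table[(10, zeros)]))
```

Values with zero or one zeros in their row get AC-superiority 0, which matches
the observation that such values are handled equally well by FC-monitoring.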


The criterion used here to measure Arc Consistency superiority vis-a-vis Forward
Checking was the number of propagated reductions. The analysis might
be  used  to  identify  arcs  with  a  low  probability  of  achieving  superior  propagation 
with  AC-monitoring.  AC-superiority  figures  could  be  obtained  at  run-time  for 
particular  values  in  variable  domains  with  the  aid  of  a  preprocessed  table  indexed 
by  domain  size  and  support-complement.  A  decision  procedure  could  then  use 
these  figures  to  switch  arcs  propagating  onto  domains  with  low  AC-superiority 
values  from  AC-monitoring  to  FC-monitoring. 
Anti-Functional Reduction (AFR), the technique described in detail in Sect. 4,
does not apply such a sophisticated decision procedure. No effort is made to
monitor AC-superiority; only those arcs that are guaranteed to be handled as
effectively by FC-monitoring are switched from AC-monitoring to FC-monitoring.

3  Adaptive  Arc  Propagation 

Adaptive  Arc  Propagation  (AAP)  is  an  adaptive  propagation  scheme  that  makes 
use  of  arc  tags  to  maintain  recommended  propagation  methods  for  each  arc 
in  a  constraint  network.  These  propagation  methods  are  selected  by  meta-level 
knowledge inference or heuristics. Anti-Functional Reduction is chosen to
demonstrate AAP in Sect. 4, but other instances could also have been applied.^2

In the following, all procedures and functions that need to be defined by
AAP instances are subscripted by inst. These include the two meta-knowledge
maintenance procedures, InitList_inst and ReviseList_inst, that create and modify
the set of arc tag lists {l(X, Y) | (X, Y) ∈ arcs(G)}.

InitList_inst initializes each l(X, Y); it might assign it a default list or take
into account existing knowledge about the arc in question. During backtrack
labelling and propagation, whenever an arc is selected to propagate a variable
reduction the first active method in the list is applied - a method is active
when its triggering conditions are satisfied. The arc's list is then revised using
the second function ReviseList_inst to take into account the changes resulting
from the propagation step. Hence each arc propagation step involves also a list
revision step. This gives meta-knowledge mechanisms a fine grained control over
propagation.

In  the  interests  of  simplicity,  the  general  AC-5  scheme  has  been  adopted  in 
AAP.  An  even  more  general  scheme  such  as  AC-Inference  could  have  been  used 
as  a  basis  for  AAP;  it  allows  the  possibility  of  the  lazy  evaluation  of  constraints  to 
minimize  constraint  checks  [2].  However,  we  do  not  explore  this  possibility  here. 
This  is  justified  from  our  point  of  view  because  our  aim  is  to  reduce  the  time 
spent on backtrack search and arc consistency maintenance while solving
large-scale problems, and our focus has been on applying algorithms that are AC-4
instances of the AC-5 scheme [15]. While AC-4's over-eager constraint checking
has drawn deserved criticism, when used for maintaining arc consistency (MAC-4)
it performs all constraint checking in the preprocessing phase avoiding the
need  for  explicit  constraint  checks  during  the  label-update  steps  of  backtracking 
search  [16,  1,  10,  11].  This  is  clearly  advantageous  when  trying  to  minimize  the 
backtrack  search  time. 


3.1  AAP  Initialization 

In addition to initialising the network and the propagation method data
structures, the initialization scheme of AAP applies the list creation function
InitList_inst for each arc (see Fig. 2). The functions InitMethod_inst, Remove and
Enqueue are similar to their equivalents in the AC-5 scheme.^3 InitMethod_inst
calls the appropriate initialization routine of the algorithm instance. Remove
deletes values found to be inconsistent by InitMethod_inst from the relevant
domain and Enqueue adds the new reductions to the queue of processable
reductions. InitList_inst uses the domains of the variables after propagation and
the arc attributes to decide the contents of the arc's initial method list.

^2 Sect. 6.2 gives one other example.


begin AAP-Initialization
    Q = {};
    for each (X, Y) ∈ arcs(G) do
    begin
        InitMethod_inst(X, Y, Δout);
        Remove(Δout, D_Y);
        Enqueue(Y, Δout, Q);
        l(X, Y) = InitList_inst(X, Y)
    end;
end AAP-Initialization


Fig. 2. AAP-Initialization


3.2  AAP  Maintenance 

AAP maintenance allows the revision of the lists created at initialization by
InitList_inst to reflect the search dynamics (Fig. 3). Dequeue obtains an arc and a
domain reduction that requires propagation. One difference from the AC-5 scheme
is that multiple reductions on a single arc are presented together by Dequeue.^4
The propagation procedure used is given in Fig. 4. ApplyActiveMethod traverses
the list of methods associated with the arc, applying the first method in
the list for which the Boolean function Active_inst returns true. This enforces the
ordering of the applicable methods implicit in the list sequence. The ordering
is important since propagation methods have different computational overheads
and various triggering conditions (see Sect. 4). The method chosen returns a
possibly empty reduction Δout on the destination variable Y. The domain of Y
is updated before the meta-knowledge function ReviseList_inst is called to revise
the list of recommended methods. Any reductions are added to the queue as
usual by Enqueue.

^3 In the pseudo-code AC-5's arc directions are reversed to be consistent with other
notation.

^4 This is a source of efficiency since reductions on a single arc are handled best together,
minimizing queue operations and allowing the quantity of reductions to be measured.
begin AAP-Maintenance
    while not EmptyQueue(Q) do
    begin
        Dequeue(Q, X, Y, Δin);
        ApplyActiveMethod(l(X, Y), X, Y, Δin, Δout);
        Remove(Δout, D_Y);
        l(X, Y) = ReviseList_inst(X, Y);
        Enqueue(Y, Δout, Q)
    end
end AAP-Maintenance


Fig. 3. AAP-Maintenance


begin ApplyActiveMethod(l, X, Y, Δin, Δout)
    while not-empty(l) do
    begin
        method = head(l);
        if Active_inst(method, X, Y)
            then apply method(X, Y, Δin, Δout);
        l = tail(l)
    end
end ApplyActiveMethod


Fig. 4. AAP propagation procedure
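
The maintenance loop and the method selection step can be sketched in Python
as follows. This is an illustration under assumptions, not the authors'
ECLiPSe implementation: it follows the prose in stopping at the first active
method, it enqueues a reduction of Y on every arc leaving Y, and the single
propagation method `revise` is a naive support test for a disequality
constraint rather than an AC-4 procedure. All names are hypothetical.

```python
from collections import deque


def apply_active_method(method_list, active, x, y, delta_in):
    """Apply the first method whose triggering condition holds (cf. Fig. 4)."""
    for method in method_list:
        if active(method, x, y):
            return method(x, y, delta_in)
    return set()  # no active method: nothing is removed


def aap_maintenance(queue, domains, lists, active, revise_list):
    """Propagate queued reductions until the queue is empty (cf. Fig. 3)."""
    while queue:
        x, y, delta_in = queue.popleft()
        delta_out = apply_active_method(lists[(x, y)], active, x, y, delta_in)
        domains[y] -= delta_out                # Remove
        lists[(x, y)] = revise_list(x, y)      # meta-knowledge revision
        if delta_out:                          # Enqueue on arcs leaving y
            for src, dst in list(lists):
                if src == y:
                    queue.append((src, dst, set(delta_out)))


# Hypothetical instance: X != Y and Y != Z, with X already reduced to {1}.
domains = {"X": {1}, "Y": {1, 2}, "Z": {1, 2}}
arcs = [("X", "Y"), ("Y", "X"), ("Y", "Z"), ("Z", "Y")]


def revise(x, y, delta_in):
    """Values of y that have no differing value left in dom(x)."""
    return {b for b in domains[y] if not any(a != b for a in domains[x])}


lists = {arc: [revise] for arc in arcs}
queue = deque([("X", "Y", {2})])  # X has just lost the value 2
aap_maintenance(queue, domains, lists,
                active=lambda method, x, y: True,
                revise_list=lambda x, y: lists[(x, y)])
print(domains)
```

An AAP instance would supply its own `active` and `revise_list` callbacks; here
both are trivial so that only the control flow is on show.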


4  Anti-Functional  Reduction 

Anti-Functional  Reduction  (AFR)  is  a  simple  instance  of  AAP  that  aims  to 
reduce  the  number  of  AC  reduction  operations  during  backtracking  search.  The 
importance  of  reducing  propagation  operations  (e.g.  decrementing  a  support 
counter  in  AC-4)  arises  from  the  mechanisms  applied  by  the  backtracking  search 
process;  the  savings  are  not  primarily  due  to  the  time  saved  performing  a  basic 
instruction  such  as  subtraction.  Every  change  that  occurs  to  a  data  structure 
(such  as  a  support  counter)  in  a  propagation  sequence  must  be  recorded  to 
enable  restoration  of  state  on  backtracking.  The  act  of  recording  and  restoring 
state  consumes  time  and  space,  and  an  unnecessary  change  to  a  data  structure 
incurs  an  overhead  and  should  be  avoided.  In  the  worst  propagation  sequence 
AC-4 performs $ea^2$ support decrement propagation operations, where e is the
number  of  constraints  and  a  is  the  maximum  domain  size. 

For a single constraint arc, FC has far fewer recordable (restorable) operations
than AC. There is no need to record multiple support counter decrements,
only a single domain reduction. Section 2 examined AC and FC propagation and
described how it is possible to estimate the probability of achieving hits
(reduction of the destination variable) using AC-monitoring but not in FC-monitoring.
However,  as  mentioned  earlier  the  instance  of  AAP  described  here  takes  the 
approach  that  any  possibility  of  a  domain  reduction  is  worth  pursuing  through 
the  application  of  AC-monitoring.  AFR  identifies  only  anti-functional  arcs  that 
guarantee  no  loss  in  propagation  when  their  monitoring  status  is  switched  from 
AC-monitoring  to  FC-monitoring,  thus  minimizing  the  amount  of  backtrack 
restoration.  Key  to  the  efficient  detection  of  anti-functional  arcs  in  AFR  is  the 
commonality  between  the  data  structures  of  AC-4,  the  AC  propagation  method 
deployed in AFR, and those required to detect anti-functionality; the
support-complement is obtained in terms of the support and the domain size of the
source variable by (1).

As  well  as  switching  detected  anti-functional  arcs  to  FC-monitoring,  AFR 
activates  FC  instead  of  AC  for  reduction  sets  that  are  instantiations.  These  two 
enhancements  reduce  the  number  of  restorable  propagation  operations  in  the 
average  case,  albeit  with  a  small  meta-knowledge  maintenance  overhead.  The 
experimental  results  of  Sect.  5  show  that  using  FC  to  support  AC  in  this  way  is 
not  only  viable  but  entirely  justified,  especially  for  hard  problems. 


4.1  AFR  Initialization 

As mentioned above, it is always preferable to give priority to FC over AC when
propagating reductions that are instantiations. A sensible initial ordering
of applicable methods is thus {fc,ac} because it forces the use of Forward
Checking in preference to Arc Consistency whenever its triggering conditions are
met (Active_afr(fc, X, Y) returns true on instantiation). Hence the function
InitList_afr, defined in Fig. 5, assigns {fc,ac} as the initial recommended list
for all arcs that are not anti-functional.


begin InitList_afr(X, Y)
    |X|' = |X| - 1;
    D_Y^* = {y_i | S_X(y_i) < |X|'};
    if D_Y^* = ∅
        return {fc}
    else
        return {fc,ac}
end


Fig. 5. AFR method list initialization
4.2 AFR Maintenance and Propagation

The task of the revision procedure of AFR is to identify when arc tags may
be switched from mixed FC- and AC-monitoring ({fc,ac}) to FC-monitoring ({fc}).
ReviseList_afr revises the method list according to the contents of a shadow domain
D_Y^* (Fig. 6). D_Y^* contains all those values that are not handled as effectively
by FC-monitoring of X. Note that D_Y^* is monotonically reduced; once a value
has been removed from D_Y^* there is never a cause to reintroduce it to D_Y^*
in remaining sub-problems. When D_Y^* is empty the switch can occur. D_Y^*
is in fact a data structure that must be restored on backtracking; however, it
should be noted that any value that is not a member of D_Y^* need never have its
support counter $S_X(y_i)$ decremented by the AC-4 propagation procedure. The
AC-4 procedure is focussed only on reducing the support counters of members
of D_Y^*, resulting in a net reduction in the number of restorable propagation
operations. Completeness of propagation is restored because the FC procedure
always activates on instantiation, determining the status of those values not in
D_Y^*.

begin ReviseList_afr(X, Y)
    |X|' = |X| - 1;
    D_Y^* = {y_i : y_i ∈ D_Y^* ∧ y_i ∈ D_Y ∧ S_X(y_i) < |X|'};
    if D_Y^* = ∅
        return {fc}
    else return {fc,ac}
end


Fig. 6. AFR method list maintenance
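
The two list functions of Figs. 5 and 6 differ only in the set of candidate
values they start from, which the following Python sketch makes explicit. It is
a simplified illustration: support counters are passed in as a plain dictionary
rather than taken from AC-4 data structures, the shadow domain is returned
alongside the method list, and all function names are hypothetical.

```python
def shadow_domain(candidates, dom_y, support, dom_x_size):
    """Candidates still in dom(Y) whose support is below |X| - 1,
    i.e. values with more than one zero in their row."""
    threshold = dom_x_size - 1
    return {y for y in candidates if y in dom_y and support[y] < threshold}


def method_list(shadow):
    """FC alone suffices once no value needs AC-monitoring."""
    return ["fc", "ac"] if shadow else ["fc"]


def init_list_afr(dom_y, support, dom_x_size):
    """Fig. 5: every value of dom(Y) is a candidate."""
    shadow = shadow_domain(dom_y, dom_y, support, dom_x_size)
    return method_list(shadow), shadow


def revise_list_afr(shadow, dom_y, support, dom_x_size):
    """Fig. 6: only current shadow-domain members are candidates."""
    shadow = shadow_domain(shadow, dom_y, support, dom_x_size)
    return method_list(shadow), shadow


# Row counts of Table 1 with |X| = 3: only value "a" has two zeros.
support = {"a": 1, "b": 2, "c": 3}
methods, shadow = init_list_afr({"a", "b", "c"}, support, 3)
print(methods, shadow)

# After "a" is deleted from dom(Y) the arc can switch to FC-monitoring.
methods_after, shadow_after = revise_list_afr(shadow, {"b", "c"}, support, 3)
print(methods_after, shadow_after)
```

Because the revision step only ever filters the previous shadow domain, the set
can shrink but never grow, which is the monotonicity property noted in the text.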


5  Empirical  Results 

5.1  Experimental  Setup 

Fifty thousand experiments were used to compare FC, AC, and AFR. A fixed
variable ordering was employed to remove extraneous influences on algorithm
performance. The experiments were conducted in groups of trials. The following
parameters were varied over groups: no. of variables (10, 20), domain size
(5, 10), and P_density (0.2, 0.4, 0.6, 0.8, 1.0). In addition, structure was
introduced into the problem constraints with two tightness parameters,
P_applicability (0.1, 0.3, 0.5, 0.7, 0.9) and P_constriction (0.1, 0.3, 0.5, 0.7, 0.9).
The three probabilities are described below:
- P_density

Problem variable pairs (X, Y) have a constraint defined on them with
probability P_density.

- P_applicability

This parameter is used to control the proportion of values in
variable domains to which constraints apply. For each constraint constr_XY,
two applicability sets (App_X and App_Y) are created. These contain the values
in the domains of X and Y that the constraint applies to. A value is not a
member of its variable's applicability set with probability P_applicability.
Low values for P_applicability imply tight constraints.

- P_constriction

P_constriction controls tightness of a constraint constr_XY given that it applies
only to those values in App_X and App_Y. Two tuple sets, Tup_X and Tup_Y,
are created. The intersection of Tup_X and Tup_Y yields the tuples of the
constraint. Each pair constructed by matching a value from App_X and any
value from dom(Y) is a member of Tup_X with probability P_constriction. In
addition, pairs constructed by matching values not in App_X to any value
from dom(Y) are members of Tup_X. Tup_Y is constructed symmetrically.
Hence low values for P_constriction imply tight constraints.
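
The construction of a single random constraint from the two tightness
parameters can be sketched as below. This is one reading of the description
above, written in Python for illustration only; the function name, the use of
a `random.Random` instance and the representation of a constraint as a set of
allowed (x, y) pairs are assumptions, not details given in the paper.

```python
import random


def random_constraint(dom_x, dom_y, p_applicability, p_constriction, rng):
    """Return the allowed (x, y) pairs of one randomly generated constraint."""
    # A value is left OUT of its applicability set with probability p_applicability.
    app_x = {x for x in dom_x if rng.random() >= p_applicability}
    app_y = {y for y in dom_y if rng.random() >= p_applicability}
    # Pairs on non-applicable values are always kept; pairs on applicable
    # values are kept with probability p_constriction.
    tup_x = {(x, y) for x in dom_x for y in dom_y
             if x not in app_x or rng.random() < p_constriction}
    tup_y = {(x, y) for x in dom_x for y in dom_y
             if y not in app_y or rng.random() < p_constriction}
    return tup_x & tup_y


rng = random.Random(0)
dom = list(range(5))
full = {(x, y) for x in dom for y in dom}

# Constraint applies to no values: every pair is allowed.
loose = random_constraint(dom, dom, 1.0, 0.0, rng)
# Constraint applies everywhere and keeps nothing: no pair is allowed.
tight = random_constraint(dom, dom, 0.0, 0.0, rng)
# A setting from the experimental grid.
sample = random_constraint(dom, dom, 0.3, 0.7, rng)
print(len(loose), len(tight), len(sample))
```

Deciding whether a pair of variables receives a constraint at all, with
probability P_density, would be a separate coin flip made before this function
is called.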

P_applicability and P_constriction are more complex than the conventional
tightness probability often applied in experimental studies. They were used instead
because they enable the configuration of constraints to be close to - or far from
- anti-functionality. For example, binary constraints on variables with a
domain of size 10 would tend to be anti-functional when P_applicability = 0 and
P_constriction = 0.9. Arguably these two parameters are a fairer representation of
real world constraints since they are more structured and allow constraints to
be relevant to only subsets of the values in their variables' domains, rather than
the more uniform distribution over tuples created by the conventional tightness
parameter.

Each  possible  combination  of  problem  parameters  was  tested  100  times.  All 
trials  were  conducted  on  a  Sun  Sparcstation  20  with  100  CPU  seconds  timeout 
per  trial.  Almost  all  timeouts  were  caused  by  FC  solving  relatively  unconstrained 
problems  (cf.  exceptionally  hard  problems  [12]).  Timeouts  were  not  penalized. 
The algorithms were implemented in the ECLiPSe Constraint Logic Programming
environment [3].


5.2  Results 

Figures  7-11  compare  AC,  FC  and  AFR  in  terms  of  average  execution  time  in 
CPU  seconds.  AFR  shows  the  smallest  increase  with  increasing  domain  size  and 
number  of  variables  (Figs.  7,  8).  Figure  9  demonstrates  that  AFR  significantly 
improves  AC  performance  across  all  densities,  and  is  generally  better  than  FC 
for all but the highest value. The graphs for P_applicability and P_constriction show
again  that  AFR  is  superior  to  AC,  and  is  better  than  or  close  to  FC  performance. 

Finally, Fig. 12 shows that AFR achieves its best improvements over AC
when solving hard problems with a large number of backtracks. Each point
marked on the graph represents the average of 100 trials or more in the case of
overlapping points. The AFR improvement was measured both in the average
absolute time and in the average number of restorable operations; the figure
demonstrates the close relationship between the two and shows that AFR reduces
AC timings by up to 60% for hard problems.


Fig. 7. Average time (25,000 trials) versus no. of variables


Fig. 8. Average time (25,000 trials) versus domain size

6  Conclusion 

6.1  Summary 

Adaptive propagation is not only viable but an effective means for improving
performance. Propagation may be adapted usefully during backtracking
search according to dynamically changing search parameters and problem
meta-knowledge. An adaptive propagation algorithm switches between a number of
propagation methods depending on these monitored parameters. In this paper
we focussed on two arc propagation methods, AC and FC. The adaptive propagation
scheme AAP was defined to allow the flexible interchange between these two
arc propagation methods. A particular instance of this scheme, AFR, reduced
the scope of AC propagation to exclude arcs that were handled as effectively by
FC. Experimental evidence confirmed that AFR performed better than a pure
AC maintenance algorithm, and was especially effective for hard problems.


Fig. 9. Average time (10,000 trials) versus P_density


Fig. 10. Average time (10,000 trials) versus P_applicability


6.2  Future  Work 

Another instance of AAP is currently being refined to reduce redundant
propagation on repair. Repair involves the reassignment of variables from old values
to new ones in their domain. Assuming that old values are mutually consistent,
variables that include their old values may remain unchanged; constraint arcs
from such variables may remain inactive until an update in the form of a
variable reassignment or domain reduction deletes the old value from their domain.
Arcs from such variables then propagate onto their destination variables,
possibly causing other variables and arcs to "wake up". This is another example of
propagation realigning to reflect the dynamics of search, in this case the search
for consistent repairs to an existing solution [4].


Fig. 11. Average time (10,000 trials) versus P_constriction


Fig. 12. Percentage improvement in time/operations versus average no. of AC
backtracks (at least 100 trials at each point)
Abstract. The constraint satisfaction community has developed a number
of heuristics for variable ordering during backtracking search. For example,
in conjunction with algorithms which check forwards, the Fail-First (FF)
and Brelaz (Bz) heuristics are cheap to evaluate and are generally considered
to be very effective. Recent work to understand phase transitions in
NP-complete problem classes enables us to compare such heuristics over a
large range of different kinds of problems. Furthermore, we are now able
to start to understand the reasons for the success, and therefore also the
failure, of heuristics, and to introduce new heuristics which achieve the
successes and avoid the failures. In this paper, we present a comparison of the
Bz and FF heuristics in forward checking algorithms applied to
randomly-generated binary CSP's. We also introduce new and very general heuristics
and present an extensive study of these. These new heuristics are usually
as good as or better than Bz and FF, and we identify problem classes
where our new heuristics can be orders of magnitude better. The result is
a deeper understanding of what helps heuristics to succeed or fail on hard
random problems in the context of forward checking, and the identification
of promising new heuristics worthy of further investigation.


1  Introduction 

In the constraint satisfaction problem (CSP) we are to assign values to variables
such that a set of constraints is satisfied, or show that no satisfying assignment
exists. This may be done via a systematic search process, such as depth first
search with backtracking, and this amounts to a sequence of decisions, where a
decision is a choice of variable and value to assign to that variable. The order
in which decisions are made can have a profound effect on search effort. Dechter
and Meiri's study of preprocessing techniques [3] shows that dynamic search
rearrangement (DSR), i.e. a variable ordering heuristic that selects as next variable
the one that has minimal number of values in its domain, dominated all other
static orderings. Here, we present three new dynamic variable ordering (dvo)
heuristics, derived as a result of our studies of phase transition phenomena of
combinatorial problems, and compare these against two existing heuristics.

* This research was supported by an HCM personal fellowship to the last author, by a
University of Strathclyde starter grant to the first author, and by an EPSRC ROPA
award GR/K/65706 for the first three authors. Authors listed alphabetically. We
thank the other members of the APES group, and our reviewers, for their comments.

Tsang,  Borrett,  and  Kwan’s  study  of  CSP  algorithms  [22]  shows  that  there 
does  not  appear  to  be  a  universally  best  algorithm,  and  that  certain  algorithms 
may be preferred under certain circumstances. We carry out a similar
investigation with respect to dvo heuristics in an attempt to determine under what
conditions  one  heuristic  dominates  another. 

In  the  next  section  we  give  a  background  to  the  study.  We  then  go  on  to 
describe  four  measures  of  the  constrainedness  of  CSP’s,  and  in  Section  4  describe 
five  heuristics,  based  on  these  measures.  The  empirical  study  is  reported  in 
Section  5,  the  heuristics  are  then  discussed  with  respect  to  previous  work  in 
Section  6,  and  conclusions  are  drawn  in  Section  7. 


2  Background 

A constraint satisfaction problem consists of a set of n variables V, each variable
v ∈ V having a domain of values M_v of size m_v, and a set of constraints C. Each
constraint c ∈ C of arity a restricts a tuple of variables (v_1, ..., v_a), and specifies
a subset of M_1 × M_2 × ... × M_a, each element of which is a combination of values
the variables are forbidden to take simultaneously by this constraint. In a binary
CSP, which the experiments reported here are exclusively concerned with, the
constraints are all of arity 2. A solution to a CSP is an assignment of a value to
every variable satisfying all the constraints. The problem that we address here
is the decision problem, i.e. finding one solution or showing that none exists.

There are two classes of complete search algorithm for the CSP, namely
those that check backwards and those that check forwards. In algorithms that
check backwards, the current variable v_i is instantiated and checking takes place
against the (past) instantiated variables. If this is inconsistent then a new value
is tried, and if no values remain then a past variable is reinstantiated. In
algorithms that check forwards, the current variable is instantiated with a value
and the (future) uninstantiated variables are made consistent, to some degree,
with respect to that instantiation. Chronological backtracking (BT), backmarking
(BM), backjumping (BJ), conflict-directed backjumping (CBJ), and dynamic
backtracking (DB) are algorithms that check backwards [11, 5, 6, 10], whereas
forward checking (FC) and maintaining arc-consistency (MAC) are algorithms
that check forwards [13, 18]. This study investigates only forward checking
algorithms, and in particular forward checking combined with conflict-directed
backjumping (FC-CBJ) [15].

Algorithm FC instantiates variable v_i with a value x_i and removes from the
domains of future variables any values that are inconsistent with respect to that
instantiation. If the instantiation results in no values remaining in the domain
of a future variable, then a new value is tried for v_i and if no values remain for
v_i (i.e. a dead end is reached) then the previous variable is reinstantiated (i.e.
chronological backtracking takes place). FC-CBJ differs from FC; on reaching a
dead end the algorithm jumps back to a variable that is involved in a conflict
with the current variable [15].
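
The behaviour of plain FC described above can be illustrated with a short
recursive search in Python. This is a minimal sketch with chronological
backtracking only; it does not implement the conflict-directed backjumping of
FC-CBJ, it uses a static variable order, and the interface (a `consistent`
predicate over two variable-value pairs) is an assumption made for the example.

```python
def fc_search(domains, consistent, assignment=None):
    """Forward checking with chronological backtracking.

    domains: {variable: set of values}
    consistent(v, a, w, b): True if v = a and w = b may hold together.
    Returns a complete assignment, or None if no solution exists.
    """
    assignment = dict(assignment or {})
    future = [v for v in domains if v not in assignment]
    if not future:
        return assignment
    var = future[0]  # static order; a dvo heuristic would choose here
    for value in sorted(domains[var]):
        # Remove from future domains the values inconsistent with var = value.
        pruned = {w: {b for b in domains[w] if consistent(var, value, w, b)}
                  for w in future if w != var}
        if all(pruned.values()):  # no future domain has been wiped out
            new_domains = {**domains, **pruned, var: {value}}
            result = fc_search(new_domains, consistent, {**assignment, var: value})
            if result is not None:
                return result
    return None  # dead end: the caller tries its next value


def differ(v, a, w, b):
    """All variables must take different values (a complete constraint graph)."""
    return a != b


three = {v: {0, 1, 2} for v in "ABC"}
two = {v: {0, 1} for v in "ABC"}
print(fc_search(three, differ))
print(fc_search(two, differ))
```

Replacing the line that picks `future[0]` with a choice of the future variable
having the smallest current domain would give the kind of dynamic ordering
discussed in the remainder of the paper.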

In selecting an algorithm we will prefer one that takes less search effort
than another, where search effort is measured as the number of times pairs of
values are compared for compatibility, i.e. consistency checks. Generally, checking
forwards reduces search effort, as does jumping back.

The order in which variables are chosen for instantiation profoundly
influences search effort. Algorithms that check backwards tend to use variable
ordering heuristics that exploit topological parameters, such as width, induced width
or bandwidth, and correspond to static instantiation orders (i.e. they do not
change during search) [21]. Algorithms that check forwards have additional
information at their disposal, such as the current size of the domains of variables.
Furthermore, since domain sizes may vary during the search process, forward
checking algorithms may use dynamic variable ordering (dvo) heuristics [17],
and it is this class of heuristics that is investigated here.

3  Constrainedness 

Many NP-complete problems display a transition in solubility as we increase the
constrainedness of problem instances. This phase transition is associated with
problems which are typically hard to solve [2]. Under-constrained problems tend
to have many solutions and it is usually easy to guess one. Over-constrained
problems tend not to have solutions, and it is usually easy to rule out all possible
solutions. A phase transition occurs in between when problems are "critically
constrained". Such problems are usually difficult to solve as they are neither
obviously soluble nor insoluble. Problems from the phase transition are often
used to benchmark CSP and satisfiability procedures [22, 9]. Constrainedness
can be used both to predict the position of a phase transition in solubility [23,
20, 16, 7, 19] and, as we show later, to motivate the construction of heuristics.

In  this  section,  we  identify  four  measures  of  some  aspect  of  constrainedness. 
These  measures  all  apply  to  an  ensemble  of  random  problems.  Such  measures 
may  suggest  whether  an  individual  problem  from  the  ensemble  is  likely  to  be 
soluble.  For  example,  a  problem  with  larger  domain  sizes  or  looser  constraints 
is  more  likely  to  be  soluble  than  a  problem  with  smaller  domains  or  tighter  con¬ 
straints,  all  else  being  equal.  To  make  computing  such  measures  tractable,  we  will 
ignore  specific  features  of  problems  (like  the  topology  of  the  constraint  graph) 
and  consider  just  simple  properties  like  domain  sizes  and  constraint  tightness. 

One simple measure of constrainedness can be derived from the size of problems
in the ensemble. Size is determined by both the number of variables and
their domain sizes. Following [7, 8], we measure problem size via the size of the
state space being explored. This consists of all possible assignments of values to
variables; its size is simply the product of the domain sizes, $\prod_{v \in V} m_v$.
We define the size ($\mathcal{N}$) of the problem as the number of bits needed to
describe a point in the state space, so we have:

$$\mathcal{N} \stackrel{\text{def}}{=} \sum_{v \in V} \log_2(m_v) \tag{1}$$

A large problem is likely to be less constrained and has a greater chance of being
soluble  than  a  small  problem  with  the  same  number  of  variables  and  constraints 
of  the  same  tightnesses. 

A second measure of constrainedness is the solution density of the ensemble.
If the constraint $c$ on average rules out a fraction $p_c$ of possible assignments,
then a fraction $1 - p_c$ of assignments are allowed. The average solution density,
$\rho$, is the mean fraction of assignments allowed by all the constraints. The mean
solution density over the ensemble is

$$\rho = \prod_{c \in C} (1 - p_c) \tag{2}$$

Problems  with  loose  constraints  have  high  solution  density.  As  noted  above,  all 
else  being  equal,  a  problem  with  a  high  solution  density  is  more  likely  to  be 
soluble  than  a  problem  with  a  low  solution  density. 

A third measure of constrainedness is derived from the size and solution density.
$E(N)$, the expected number of solutions for a problem within an ensemble,
is simply the size of the state space times the probability that a given element
in the state space is a solution. That is,

$$E(N) = 2^{\mathcal{N}} \times \rho = \prod_{v \in V} m_v \times \prod_{c \in C} (1 - p_c) \tag{3}$$

If  problems  in  an  ensemble  are  expected  to  have  a  large  number  of  solutions,  then 
an  individual  problem  within  the  ensemble  is  likely  to  be  loosely  constrained  and 
to  have  many  solutions. 

The fourth and final measure of constrainedness, $\kappa$, is again derived from
the size and solution density. This has been suggested as a general measure
of the “constrainedness” of combinatorial problems [8]. It is motivated by the
randomness with which we can set a bit in a solution to a combinatorial problem.
If $\kappa$ is small, then problems typically have many solutions and a given bit can
be set more or less at random. For large $\kappa$, problems typically have few or no
solutions and a given bit is very constrained in how it can be set. $\kappa$ is defined
by

$$\kappa \stackrel{\text{def}}{=} 1 - \frac{\log_2(E(N))}{\mathcal{N}} \tag{4}$$

$$\kappa = -\frac{\log_2(\rho)}{\mathcal{N}} = \frac{-\sum_{c \in C} \log(1 - p_c)}{\sum_{v \in V} \log(m_v)} \tag{5}$$

If $\kappa \ll 1$ then problems have a large expected number of solutions for their
size. They are therefore likely to be under-constrained and soluble. If $\kappa \gg 1$
then problems have a small expected number of solutions for their size. They
are therefore likely to be over-constrained and insoluble. A phase transition in
solubility occurs in between, where $\kappa \approx 1$ [8]. This is equivalent for CSPs to the
prediction made in [19] that a phase transition occurs when $E(N) \approx 1$.
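
As a rough illustration of equations (1), (2), (3) and (5), the following Python sketch computes the four measures from a list of domain sizes and a list of constraint tightnesses. The function name and its input format are assumptions made for this example only; they are not part of the original study.

```python
import math

def constrainedness_measures(domain_sizes, tightnesses):
    """Ensemble measures of Section 3 (illustrative helper, hypothetical name).

    domain_sizes: one m_v per variable (each >= 1, not all equal to 1).
    tightnesses:  one p_c per constraint, each in [0, 1].
    Returns (size_bits, density, expected_solutions, kappa).
    """
    domain_sizes = list(domain_sizes)
    tightnesses = list(tightnesses)
    size_bits = sum(math.log2(m) for m in domain_sizes)        # equation (1)
    if size_bits == 0:
        raise ValueError("state space has a single point; kappa is undefined")
    density = math.prod(1.0 - p for p in tightnesses)          # equation (2)
    expected_solutions = math.prod(domain_sizes) * density     # equation (3)
    # Equation (5), summed in log space so that many constraints do not underflow.
    if any(p >= 1.0 for p in tightnesses):
        kappa = math.inf
    else:
        kappa = -sum(math.log2(1.0 - p) for p in tightnesses) / size_bits
    return size_bits, density, expected_solutions, kappa

# 20 variables with domain size 10 and 95 constraints of tightness 0.3.
size_bits, density, expected, kappa = constrainedness_measures([10] * 20, [0.3] * 95)
print(f"size = {size_bits:.2f} bits, E(N) = {expected:.3g}, kappa = {kappa:.3f}")
```

The ratio of logarithms in (5) does not depend on the base, so base 2 is used throughout here for consistency with (1) and (4).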

4 Heuristics for Constrainedness

Many  heuristics  in  CSPs  branch  on  what  can  often  be  seen  as  an  estimate  of 
the  most  constrained  variable  [8].  Here,  we  describe  two  well  known  heuristics 
for  CSPs  and  three  new  heuristics.  We  use  the  four  measures  of  constrained¬ 
ness  described  above.  These  measures  were  defined  for  an  ensemble  of  problems. 
Each  measure  can  be  computed  for  an  individual  problem,  but  will  give  only 
an  estimate  for  the  constrainedness  of  an  individual  problem.  For  example,  an 
insoluble problem has zero solution density and this may be very different from
the measured value of $\rho$. Even so, such measures can provide both a good
indication of the probability of a solution existing and, as we show here, a heuristic
estimate  of  the  most  constrained  variable. 

Below, we adopt the following conventions. When a variable $v_i$ is selected as
the current variable and instantiated with a value, $v_i$ is removed from the set of
variables $V$, constraint propagation takes place, and all constraints incident on
$v_i$, namely $C_i$, are removed from the set of constraints $C$. Therefore $V$ is the set
of future variables, $C$ is the set of future constraints, $m_j$ is the actual size of the
domain of $v_j \in V$ after constraint propagation, $p_c$ is the actual value of constraint
tightness for constraint $c \in C$ after constraint propagation, and $C_j$ is the set of
future constraints incident on $v_j$. All characteristics of the future subproblem
are recomputed and made available to the heuristics as local information.

4.1  Heuristic  FF 

Haralick  and  Elliott  [13]  proposed  the  fail-first  principle  for  CSPs  as  follows:  “To 
succeed, try first where you are most likely to fail.” The reason for attempting
next  the  task  which  is  most  likely  to  fail  is  to  encounter  dead-ends  early  on  and 
prune  the  search  space.  Applied  as  a  constraint  ordering  heuristic  this  suggests 
that  we  check  first  the  constraints  that  are  most  likely  to  fail  and  when  applied 
as  a  variable  ordering  heuristic,  that  we  choose  the  most  constrained  variable. 
An  estimate  for  the  most  constrained  variable  is  the  variable  with  the  smallest 
domain. That is, we choose $v_i \in V$ such that $m_i$ is a minimum.

An alternative interpretation of this heuristic is to branch on $v_i$ such that we
maximize the size of the resulting subproblem, without considering the constraint
information on that variable. That is, choose the variable $v_i \in V$ that maximizes

$$\sum_{v \in V - v_i} \log(m_v) \tag{6}$$

where $V - v_i$ is the set of future variables with $v_i$ removed. This is the same as
selecting the variable $v_i$ which maximizes the denominator of equation (5).
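
The two readings of FF can be compared with a small Python sketch. It assumes, purely for illustration, that the future variables are held in a dictionary mapping a variable index to its current domain size; both function names are hypothetical.

```python
import math

def ff_select(domain_sizes):
    """Fail-first: smallest current domain, ties broken on the lowest index."""
    return min(sorted(domain_sizes), key=lambda v: domain_sizes[v])

def ff_select_via_eq6(domain_sizes):
    """Same choice, phrased as maximizing expression (6) for the subproblem."""
    total = sum(math.log(m) for m in domain_sizes.values())
    # Removing v_i leaves total - log(m_i), which is largest when m_i is smallest.
    return max(sorted(domain_sizes), key=lambda v: total - math.log(domain_sizes[v]))

sizes = {0: 4, 1: 2, 2: 7, 3: 2}
print(ff_select(sizes), ff_select_via_eq6(sizes))
```

Since the total is the same for every candidate, the two functions differ only in presentation, which is the point made above.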

4.2  Heuristic  Bz 

The  Brelaz  heuristic  (Bz)  comes  from  graph  colouring  [1];  we  wish  to  find  a 
colouring  of  the  vertices  of  a  graph  such  that  adjacent  vertices  have  different 
colours.  Given  a  partial  colouring  of  a  graph,  the  saturation  of  a  vertex  is  the 
number  of  differently  coloured  vertices  adjacent  to  it.  A  vertex  with  high  satura¬ 
tion  will  have  few  colours  available  to  it.  The  Bz  heuristic  first  colours  a  vertex  of 
maximum degree. Thereafter Bz selects an uncoloured vertex of maximum
saturation, tie-breaking on the degree in the uncoloured subgraph. Bz thus chooses
to colour next what is estimated to be the most constrained vertex.

When  applying  Bz  to  a  CSP  we  choose  the  variable  with  smallest  domain  size 
and  tie-break  on  degree  in  the  future  subproblem.  That  is,  choose  the  variable 
with smallest $m_i$ and tie-break on the variable with greatest future degree $|C_i|$. In
a  fully  connected  constraint  graph,  Bz  will  behave  like  FF,  because  all  variables 
have  the  same  degree. 
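
A minimal sketch of this selection rule in Python follows. The dictionary inputs (current domain size and future degree per variable) and the function name are assumptions for the example; the final tie-break on the lowest index is also an assumption.

```python
def bz_select(domain_sizes, future_degree):
    """Brelaz-style choice: smallest domain, then greatest future degree."""
    return min(
        sorted(domain_sizes),
        key=lambda v: (domain_sizes[v], -future_degree[v]),
    )

sizes = {0: 3, 1: 2, 2: 2}
degree = {0: 5, 1: 1, 2: 4}
print(bz_select(sizes, degree))  # variables 1 and 2 tie on size; 2 has higher degree
```

When every variable has the same future degree, as in a clique, the second component of the key is constant and the rule reduces to FF.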

4.3  Heuristic  Rho 

The Rho ($\rho$) heuristic branches into the subproblem that maximizes the solution
density, $\rho$. The intuition is to branch into the subproblem where the greatest
fraction of states are expected to be solutions. To maximize $\rho$, we select the
variable $v_i \in V$ that maximizes

$$\prod_{c \in C - C_i} (1 - p_c) \tag{7}$$

where $C - C_i$ is the set of future constraints that do not involve variable $v_i$, and
$(1 - p_c)$ is the looseness of a constraint. If we express (7) as a sum of logarithms,
$\sum_{c \in C - C_i} \log(1 - p_c)$, then this corresponds to selecting a variable that minimizes
the numerator of (5). Expression (7) gives an estimate of the solution density
of the subproblem after selecting $v_i$. More concisely (and more efficiently),
we choose the future variable $v_i$ that minimizes

$$\prod_{c \in C_i} (1 - p_c) \tag{8}$$

This  is  the  variable  with  the  most  and/or  tightest  constraints.  Again,  we  branch 
on  an  estimate  of  the  most  constrained  variable. 
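
Expression (8) can be sketched in Python as below. The input format, a dictionary from each future variable to the tightnesses of the future constraints incident on it, is an assumption for the example, as are the function name and the lowest-index tie-break.

```python
import math

def rho_select(incident_tightness):
    """Pick the variable minimizing the product of looseness over its constraints."""
    return min(
        sorted(incident_tightness),
        key=lambda v: math.prod(1.0 - p for p in incident_tightness[v]),
    )

# Variable 0 has two loose constraints, variable 1 one tight constraint,
# variable 2 no future constraints (empty product = 1).
incident = {0: [0.2, 0.2], 1: [0.8], 2: []}
print(rho_select(incident))
```

In this example the single tight constraint on variable 1 outweighs the two loose constraints on variable 0, illustrating the "most and/or tightest" reading.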

4.4  Heuristic  E(N) 

The E(N) heuristic branches into the subproblem that maximizes the expected
number of solutions, $E(N)$. This will tend to maximize both the subproblem
size (the FF heuristic) and its solution density (the Rho heuristic). Therefore,
we select a variable $v_i \in V$ that maximizes

$$\prod_{v \in V - v_i} m_v \times \prod_{c \in C - C_i} (1 - p_c) \tag{9}$$

where $V - v_i$ is the set of future variables with $v_i$ removed, and $C - C_i$ is the set
of future constraints that do not involve variable $v_i$. This can be more succinctly
(and efficiently) expressed as: choose the variable $v_i \in V$ that minimizes

$$m_i \prod_{c \in C_i} (1 - p_c) \tag{10}$$

The  E(N)  heuristic  has  an  alternative,  intuitively  appealing,  justification. 
Let  N  be  the  number  of  solutions  to  the  current  subproblem.  At  the  root  of 
the  tree,  N  is  the  total  number  of  solutions  to  the  problem.  If  N=0,  the  current 
subproblem has no solutions, and the algorithm will at some point backtrack.
If N=1, the current subproblem has exactly one solution, and N will remain
constant on the path leading to this solution, but be zero everywhere else. As we
move down the search tree, N cannot increase as we instantiate variables. The
obvious heuristic is to maximize N in the future subproblem. We use E(N) as
an estimate for N, so we branch into the subproblem that maximizes E(N). And
this is again an estimate for the most constrained variable, as loosely constrained
variables will tend to reduce N most. Consider a loosely constrained variable $v_i$
that can take any value in its domain. Branching on this variable will reduce N
to $N/m_i$. Tightly constrained variables will not reduce N as much.
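
The E(N) selection rule, in the form that minimizes the product of a variable's domain size and the looseness of its incident future constraints, might be sketched as follows. The dictionary inputs, the function name and the lowest-index tie-break are assumptions for illustration.

```python
import math

def en_select(domain_sizes, incident_tightness):
    """Pick the variable minimizing m_i times the looseness of its constraints."""
    return min(
        sorted(domain_sizes),
        key=lambda v: domain_sizes[v]
        * math.prod(1.0 - p for p in incident_tightness[v]),
    )

sizes = {0: 10, 1: 20, 2: 10}
incident = {0: [0.2], 1: [0.8], 2: [0.5]}
print(en_select(sizes, incident))  # scores are roughly 8, 4 and 5
```

Here FF would choose variable 0 on domain size alone, whereas the tight constraint on variable 1 makes it the preferred branch under this rule despite its larger domain.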


4.5  Heuristic  Kappa 

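The Kappa heuristic branches into the subproblem that minimizes $\kappa$. Therefore,
select a variable $v_i \in V$ that minimizes

$$\frac{-\sum_{c \in C - C_i} \log(1 - p_c)}{\sum_{v \in V - v_i} \log(m_v)} \tag{11}$$

Let $\alpha$ be the numerator and $\beta$ be the denominator of equation (5), the
definition of $\kappa$. That is, $\alpha = -\sum_{c \in C} \log(1 - p_c)$ and $\beta = \sum_{v \in V} \log(m_v)$. Then we
select a variable $v_i \in V$ such that we minimize the following, which is (11)
rewritten in terms of $\alpha$ and $\beta$:

$$\frac{\alpha + \sum_{c \in C_i} \log(1 - p_c)}{\beta - \log(m_i)} \tag{12}$$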


This  heuristic  was  first  suggested  in  [8]  but  has  not  yet  been  tested  extensively 
on a range of CSPs, and depends on the proposal in [8] that $\kappa$ captures a notion
of the constrainedness of an ensemble of problems. We assume that $\kappa$ provides
an estimate for the constrainedness of an individual in that ensemble. We again
want to branch on a variable that is estimated to be the most constrained,
giving the least constrained subproblem. We estimate this by the subproblem
with smallest $\kappa$. This suggests the heuristic of minimizing $\kappa$.
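
One way to realise this selection is sketched below in Python, evaluating the constrainedness of the subproblem left after removing each candidate variable and its incident constraints. The data layout (domain sizes keyed by variable, tightnesses keyed by variable pair), the function names, and the handling of degenerate cases are all assumptions for this example rather than the implementation used in the experiments.

```python
import math

def log_looseness(p):
    """log(1 - p), with minus infinity when every pair conflicts."""
    return -math.inf if p >= 1.0 else math.log(1.0 - p)

def kappa_after_removing(var, domain_sizes, constraints):
    """Constrainedness of the future subproblem once var is removed."""
    numerator = -sum(
        log_looseness(p) for (a, b), p in constraints.items() if var not in (a, b)
    )
    denominator = sum(math.log(m) for v, m in domain_sizes.items() if v != var)
    if denominator == 0.0:
        # Remaining state space is a single point (assumed convention).
        return 0.0 if numerator == 0.0 else math.inf
    return numerator / denominator

def kappa_select(domain_sizes, constraints):
    """Pick the variable whose removal leaves the smallest kappa."""
    return min(
        sorted(domain_sizes),
        key=lambda v: kappa_after_removing(v, domain_sizes, constraints),
    )

sizes = {0: 10, 1: 10, 2: 10}
tightness = {(0, 1): 0.2, (1, 2): 0.8, (0, 2): 0.5}
print(kappa_select(sizes, tightness))
```

In the example, variable 2 carries the two tightest constraints, so removing it leaves the least constrained subproblem.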


4.6  Implementing  the  heuristics 

We  use  all  the  above  heuristics  with  the  forward  checking  algorithm  FC-CBJ. 
After  the  current  variable  has  successfully  been  assigned  a  value  (i.e.  after  domain 
filtering  all  future  variables  have  non-empty  domains),  the  constraint  tightness 
is recomputed for any constraint acting between a pair of variables, $v_j$ and $v_k$,
such that values have just been removed from the domain of $v_j$ or $v_k$, or both. To
compute constraint tightness $p_c$ for constraint $c$ acting between variables $v_j$ and
$v_k$ we count the number of conflicting pairs across that constraint and divide by
the product of the new domain sizes. This counting may be done via consistency
checking and will take $m_j \times m_k$ checks. Constraint tightness will then be in the
range 0 (all pairs compatible) to 1 (all pairs are conflicts). When computing the
sum of the log looseness of constraints (i.e. the numerator of equation (5)), if
$p_c = 1$ a value of $-\infty$ is returned. Consequently, the Kappa heuristic will select
variable $v_j$ or $v_k$ next, and the instantiation will result in a dead end.
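
The recomputation of tightness by counting conflicting pairs can be sketched as follows. The function name and the representation of a constraint as a Python predicate are assumptions for illustration; both domains are assumed non-empty, as is the case after a successful forward check.

```python
def constraint_tightness(domain_j, domain_k, compatible):
    """Fraction of value pairs across the constraint that conflict.

    Uses len(domain_j) * len(domain_k) calls to `compatible`, i.e. one
    consistency check per pair of remaining values.
    """
    conflicts = sum(
        1 for a in domain_j for b in domain_k if not compatible(a, b)
    )
    return conflicts / (len(domain_j) * len(domain_k))

def not_equal(a, b):
    return a != b

print(constraint_tightness([1, 2, 3], [1, 2, 3], not_equal))
# After filtering, the same constraint can have a different tightness:
print(constraint_tightness([1, 2], [1], not_equal))
```
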

In the FF heuristic the first variable selected is the variable with smallest
domain size, and when all variables have the same domain size we select first the
lowest indexed variable $v_i$. For the Bz heuristic saturation is measured as the
inverse  of  the  domain  size;  i.e.  the  variable  with  smallest  domain  size  will  have 
largest  saturation.  Consequently,  when  the  constraint  graph  is  a  clique  FF  and 
Bz  will  have  identical  behaviours. 

Search  costs  reported  in  this  paper  do  not  include  the  cost  in  terms  of  con¬ 
sistency  checks  of  recomputing  the  constraint  tightness.  This  overhead  makes 
some  of  the  heuristics  less  competitive  than  our  results  might  suggest.  However, 
our  main  concern  here  is  to  establish  sound  and  general  principles  for  selecting 
variable  ordering  heuristics.  In  the  future,  we  hope  to  develop  book-keeping  tech¬ 
niques  and  approximations  to  the  heuristics  that  reduce  the  cost  of  re-computing 
or  estimating  the  constraint  tightness  but  which  still  give  good  performance. 

5  The  Experiments 

The experiments attempt to identify under what conditions one heuristic is
better than another. Initially, experiments are performed over uniform randomly
generated CSPs. In a problem $\langle n, m, p_1, p_2 \rangle$ there will be $n$ variables, with a
uniform domain of size $m$, $p_1 n(n-1)/2$ constraints, and exactly $p_2 m^2$ conflicts over
each constraint [16, 19]. This class of problem is then modified such that we
investigate problems with non-uniform domains and constraint tightness.
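
A generator for this uniform problem class might look like the Python sketch below. It assumes the usual reading of the model, in which a fraction of the n(n-1)/2 possible variable pairs is chosen without replacement and each chosen pair receives a fixed number of conflicting value pairs; the function name, the rounding to whole numbers and the `seed` parameter are assumptions for the example.

```python
import itertools
import random

def random_csp(n, m, p1, p2, seed=None):
    """Return {(i, j): set of conflicting (a, b) value pairs} for one problem."""
    rng = random.Random(seed)
    variable_pairs = list(itertools.combinations(range(n), 2))
    value_pairs = list(itertools.product(range(m), repeat=2))
    num_constraints = round(p1 * len(variable_pairs))
    num_conflicts = round(p2 * m * m)
    constraints = {}
    for pair in rng.sample(variable_pairs, num_constraints):
        # Exactly num_conflicts distinct conflicting pairs per constraint.
        constraints[pair] = set(rng.sample(value_pairs, num_conflicts))
    return constraints

problem = random_csp(n=20, m=10, p1=0.5, p2=0.3, seed=0)
print(len(problem), "constraints,", len(next(iter(problem.values()))), "conflicts each")
```
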

When plotting the results, problems will be measured in terms of their
constrainedness, $\kappa$. This is because in some experiments we vary the number of
variables and keep the degree of variables $\gamma$ constant, vary the tightness of
constraints $p_2$, and so on. By using constrainedness we hope to get a clear picture of
what happens. Furthermore, in non-uniform problems constrainedness appears
to be one of the few measures that we can use. It should be noted that in the
experiments the complexity peak does not always occur exactly at $\kappa = 1$, and
that in sparse constraint graphs the peak tends to occur at lower values of $\kappa$,
typically in the range 0.6 to 0.9. This has been observed empirically in [16], and
an explanation is given by Smith and Dyer [19].

In  all  of  the  graphs  we  have  kept  the  same  line  style  for  each  of  the  heuristics. 
The  labels  in  the  graphs  have  then  been  ordered,  from  top  to  bottom,  to  corre¬ 
spond  to  the  ranking  of  the  heuristics  in  the  phase  transition.  The  best  heuristic 
will  thus  appear  first. 

5.1  Uniform  Problems,  Varying  Constraint  Graph  Density 

The  aim  of  this  experiment  is  to  determine  how  the  heuristics  are  affected  as 
we  vary  the  number  of  constraints  within  the  constraint  graph.  The  experiments 
were  performed  over  problems  with  20  variables,  each  with  a  domain  size  of  10. 
In Figure 1, we plot the mean performance for sparse constraint graphs¹ with
$p_1 = 0.2$, maximally dense constraint graphs with $p_1 = 1.0$ and constraint graphs
of intermediate density $p_1 = 0.5$. At each density 1,000 problems were generated
at each possible value of $p_2$ from 0.01 to 0.99 in steps of 0.01.

¹ Disconnected graphs were not filtered out since they had little effect on performance.

Fig. 1. Mean performance of heuristics for $\langle 20, 10 \rangle$


For  sparse  constraint  graphs  (see  Figure  1(a)),  Bz  performs  best,  whilst  E(N) 
and  Kappa  are  not  far  behind.  Rho  is  significantly  worse  and  FF  even  more  so. 
Analysing the distribution in performance (graphs are not shown), e.g. the
median, 95% and higher percentiles, we observed a similar ranking of the heuristics
with the differences between the heuristics opening up in the higher percentiles in
the middle of the phase transition. As problems become more dense at $p_1 = 0.5$
(see Figure 1(b)) Kappa dominates E(N). Rho and FF continue to perform
poorly, although FF does manage to overtake Rho.

For complete graphs with $p_1 = 1.0$ (see Figure 1(c)), Bz and FF are identical,
as expected. (The contour for FF overwrites the Bz contour.) For uniform and
sparse problems, Bz seemed to be best, whilst for uniform and dense problems,
Kappa or E(N) would seem to be best.

For  comparison  with  the  dynamic  variable  ordering  heuristics,  in  Figure  1(d) 
we  also  plot  the  mean  performance  of  FC-CBJ  with  a  static  variable  ordering: 
variables  were  considered  in  lexicographic  order.  Performance  is  much  worse 
with  a  static  ordering  than  with  any  of  the  dynamic  ordering  heuristics,  even  on 
the  relatively  easy  sparse  constraint  graphs.  The  secondary  peaks  for  the  static 
variable ordering at low $\kappa$ occur as a result of ehps [20], occasional “exceptionally
hard” problems that arise following poor branching decisions early in search [9].
The worst case outside the phase transition was more than 14 million checks at
$\kappa = 0.46$, in a region where 100% of problems were soluble. This was 5 orders of
magnitude worse than the median of 288 checks at this point.

5.2 Uniform Problems, Varying Number of Variables n

Fig. 2. Mean performance for FC-CBJ + heuristics for $\langle n, 10 \rangle$ with $\gamma = 5$. Panel (c): peaks of mean performance over n.

The  aim  of  this  experiment  is  to  determine  how  the  heuristics  scale  with 
problem  size.  At  first  sight,  this  can  be  simply  done  by  increasing  the  number 
of variables $n$, while keeping all else constant. However, if $n$ increases while $p_1$ is
kept constant the degree $\gamma$ of a variable (i.e. the number of constraints incident
on a variable) also increases. To avoid this, we vary $p_1$ with $n$ such that average
degree $\gamma$ remains constant at 5, similar to [12]. To observe a phase transition,
1,000 problems were then generated at each possible value of $p_2$ from 0.01 to
0.99  in  steps  of  0.01. 
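
The required density follows from a short calculation, given here as a worked step under the assumption that a problem has $p_1 n(n-1)/2$ constraints and that each constraint contributes to the degree of two variables:

```latex
\gamma \;=\; \frac{2}{n} \cdot \frac{p_1\, n (n-1)}{2} \;=\; p_1 (n-1)
\qquad\Longrightarrow\qquad
p_1 \;=\; \frac{\gamma}{n-1}
```

With $\gamma = 5$ this gives $p_1 = 5/29 \approx 0.172$ for $n = 30$ and $p_1 = 5/49 \approx 0.102$ for $n = 50$, so the constraint graph becomes sparser as $n$ grows.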

In  Figure  2,  we  plot  the  performance  of  each  heuristic  as  we  increase  n.  In 
Figures  2(a)  and  (b),  we  show  the  mean  performance  for  n  =  30  and  n  =  50 
respectively.  The  ranking  of  the  heuristics  remains  the  same  as  in  the  previous 
experiment  for  constraint  graphs  of  intermediate  density.  Though  not  shown,  we 
observed  similar  behaviour  in  the  distribution  of  performance  (e.g.  median,  95% 
and  higher  percentiles).  As  before,  the  differences  between  the  heuristics  tend  to 
open  up  in  the  higher  percentiles  in  the  middle  of  the  phase  transition. 

In Figure 2(c) we plot the peak in average search effort in the phase transition
region for each value of n. This then gives a contour showing how search cost
increases with n, for this class of problem. The figure suggests that Bz, Kappa
and E(N) scale in a similar manner. Using a least-squares linear fit on the limited
data available, we conjecture that E(N) would become better than Bz when
n > 90, and Kappa would do likewise when n > 164. Further empirical studies
on larger problems would be needed to confirm this. However, Rho and FF
appear to scale less well. The gradients of Figure 2(c) suggest that FF and Rho
scale with larger exponents than Bz, Kappa and E(N).

5.3  Problems  with  Non-Uniform  Constraint  Tightness 

All  experiments  considered  above  have  constraints  generated  uniformly.  That  is, 
a single value of $p_2$ describes the tightness of every constraint. At the start of
search,  every  constraint  is  equally  tight,  so  a  good  measure  of  the  constrained¬ 
ness  of  a  variable  is  simply  the  number  of  constraints  involving  this  variable 
(i.e.  the  variable’s  degree),  together  with  its  domain  size.  Even  as  we  progress 
through  search  and  tightnesses  vary,  this  measure  should  still  be  reasonably  ac¬ 
curate.  This  might  explain  why  Bz  has  never  been  significantly  worse  in  earlier 
experiments than Kappa or E(N), which undertake the computationally heavy
overhead  of  measuring  exact  constraint  tightnesses. 

If  we  are  given  a  problem  with  significantly  varying  constraint  tightnesses  we 
must  take  account  of  this  to  measure  constrainedness  accurately.  We  therefore 
expect  that  Bz  and  FF  may  perform  poorly  on  problems  with  varying  constraint 
tightnesses,  while  the  other  heuristics  should  perform  well,  because  they  do  take 
account  of  constraint  tightness.  To  test  this  hypothesis,  we  generated  problems 
with  mainly  loose  constraints,  but  a  small  number  of  very  tight  constraints.  We 
did  this  by  generating  problems  with  a  multiple  of  5  constraints,  and  choosing 
exactly 20% of these constraints to have tightness $p_2 = 0.8$ (i.e. tight constraints)
and the remainder tightness $p_2 = 0.2$ (i.e. loose constraints). We expect Bz to
perform  poorly  on  these  problems  as  it  will  tie-break  on  the  number  of  constraints 
and  not  the  tightness  of  those  constraints  (the  more  significant  factor  in  this 
problem  class). 
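
The assignment of tightnesses in this problem class can be sketched in Python as follows. The function name and the `seed` parameter are hypothetical, and choosing the tight constraints uniformly at random is an assumption for the example.

```python
import random

def mixed_tightness(num_constraints, seed=None):
    """Give exactly 20% of the constraints tightness 0.8 and the rest 0.2."""
    if num_constraints % 5 != 0:
        raise ValueError("number of constraints must be a multiple of 5")
    rng = random.Random(seed)
    tight = set(rng.sample(range(num_constraints), num_constraints // 5))
    return [0.8 if c in tight else 0.2 for c in range(num_constraints)]

tightness = mixed_tightness(75, seed=1)
print(tightness.count(0.8), "tight,", tightness.count(0.2), "loose")
```
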

We set $n = 30$ and $m = 10$, and to observe a phase transition we varied the
constraint graph density, $p_1$, from [illegible] to 1 in steps of [illegible]. Results are plotted in
Figure 3. The 50% solubility point is at $\kappa \approx 0.64$ when $p_1 =$ [illegible].

Median  performance,  Figure  3(a),  shows  that  as  predicted  Kappa  and  E(N) 
do  well.  Most  significantly,  Bz  is  dominated  by  all  except  FF.  This  is  the  first  of 
our  experiments  so  far  where  Bz  has  been  shown  to  perform  relatively  poorly. 

Figure  3(b)  shows  the  75th  percentiles  for  the  five  heuristics  (i.e.  75%  of 
problems  took  less  than  the  plotted  amount  of  search  effort)  and  Figure  3(d) 
shows  worst  case.  We  see  that  at  the  75th  percentile  there  is  a  greater  difference 
between  the  heuristics,  suggesting  a  more  erratic  behaviour  from  FF  and  Bz. 
Mean performance (Figure 3(c)) and worst case performance (Figure 3(d)) show
the existence of exceptionally hard problems for FF and Bz. The worst case for
FF was 26,545 million consistency checks at $\kappa \approx 0.39$, in a region where 100% of
problems were soluble. This was 8 orders of magnitude worse than the median
of 659 checks at this point, and took 87 hours on a DEC Alpha 200.

Fig. 3. Performance of heuristics on $n = 30$ and $m = 10$, with $p_2 = 0.2$ for 80% of the
constraints, and $p_2 = 0.8$ for the remainder. Panels: (a) median checks, (b) 75% checks,
(c) mean checks, (d) maximum checks. Note the different y-scales.

5.4  Problems  with  Non-Uniform  Domain  Size 

Unlike the other four heuristics, Rho completely ignores the domain sizes and
their contribution to problem constrainedness. We therefore expect that the Rho
heuristic will do poorly on problems with mixed domain sizes. To test this
hypothesis, we generated 20 variable problems, giving each variable a domain of
size 10 with probability 0.5 and a domain of size 20 otherwise. We denote this as
$m = \{10, 20\}$. To observe a phase transition, we fixed the constraint density $p_1$
at 0.5 and varied $p_2$ from 0.01 to 0.99 in steps of 0.01, generating 1,000 problems
at  each  point.  We  plot  the  results  for  mean  checks  for  each  of  the  heuristics  in 
Figure  4.  As  predicted,  the  Rho  heuristic  performs  worse  than  in  the  previous 
problem  classes.  This  seems  to  reaffirm  the  worth  of  exploiting  information  on 
domain  sizes. 

Fig. 4. Performance of FC-CBJ, with $n = 20$, $m = \{10, 20\}$ and $p_1 = 0.5$

6 Discussion

Theory-based heuristics for the binary CSP are presented by Nudel [14], based
on the minimization of a complexity estimate, namely the number of compound
labels at a given depth in the search tree. Two classes of heuristic are presented,
global and local. Global heuristics fix the instantiation order at the start of
search, whereas local heuristics take account of information made available
during search, such as actual domain sizes and constraint tightness. Nudel's local
heuristics are thus dynamic variable ordering (dvo) heuristics. Three dvo
heuristics are presented, IO2, IO3, and IO4. IO2 chooses “next below a node, that
variable with minimum number $m_i$ of surviving labels after forward checking at
the node”, and is equivalent to FF. Heuristic IO3 tie-breaks IO2 by choosing
the variable (with smallest domain) that most constrains future variables, and
has much in common with Bz. IO4 stops when any future constraint disallows
all tuples across that constraint. As Nudel says, this is not so much a heuristic
but an algorithmic step. IO4 is implicit in heuristics Rho, E(N), and Kappa.

It is interesting to contrast our approach with Nudel's as both give theory-based
variable ordering heuristics. Nudel gives measures that estimate the size
of  the  remaining  search  tree,  and  then  constructs  heuristics  which  seek  to  min¬ 
imize  these  estimates.  We  have  not  related  our  measures  directly  to  the  search 
tree.  Instead  we  have  sought  to  move  into  areas  of  the  search  tree  likely  to  be 
unconstrained  and  therefore  have  solutions.  When  one  makes  certain  simplifica¬ 
tions,  both  approaches  can  result  in  the  same  heuristic  such  as  FF.  However,  the 
detailed  relationship  between  the  approaches  has  not  yet  been  fully  analysed. 

Feldman and Golumbic [4] applied Nudel's heuristics to real-world constraint
satisfaction problems. Three heuristics are presented, one for a backward
checking algorithm (BT), and two for a forward checking algorithm (FC1 and FC2).
All three heuristics were applied as global/static orderings. Heuristic FC1 selects
$v_i$ with minimum $m_i \prod_j (1 - p_{i,j})$, where $p_{i,j}$ is tightness of the constraint acting
between $v_i$ and future variable $v_j$. This corresponds to a global E(N) ordering.
Heuristic FC2 takes into consideration all constraints, and selects variable $v_i$
with minimum $m_i \prod (1 - p_{j,k})$. As far as we can see, there is no
correspondence between FC2 and the heuristics presented here. In their experiments
heuristic FC1 dominated FC2 on hard problems.

The new dvo heuristics presented here may be used as global/static variable
ordering heuristics. When we have uniform constraint tightness, Rho will
correspond to a reverse maximum cardinality ordering [3], suitable for forward
checking algorithms. If all variables have the same constraint tightness then E(N)
maximizes $\mathcal{N}$ (the FF heuristic), and if all variables have the same domain size
E(N) simplifies to maximizing $\rho$ (the Rho heuristic). Like the E(N) heuristic, the
Kappa heuristic simplifies to maximizing $\mathcal{N}$ (the FF heuristic) if all variables
have the same constraint tightness and to maximizing $\rho$ (the Rho heuristic) if all
variables have the same domain size. Clearly, FF and Bz can be considered as low
cost surrogates of the minimize Kappa heuristic; both attempt to minimize (11)
by maximizing the denominator, and Bz tie-breaks by estimating the numerator
of (11) by assuming all constraints are of the same tightness.

7  Conclusions 

Three  new  variable  ordering  heuristics  for  the  CSP  have  been  presented,  namely 
E(N),  Rho,  and  Kappa.  These  new  heuristics  are  a  product  of  our  investigations 
into  phase  transition  phenomena  in  combinatorial  problems.  The  new  heuristics 
have  two  properties  in  common.  Firstly,  they  all  attempt  to  measure  the  con¬ 
strainedness  of  a  subproblem,  and  secondly,  they  attempt  to  branch  on  the  most 
constrained  variable  giving  the  least  constrained  subproblem.  The  heuristics  dif¬ 
fer  in  how  they  measure  constrainedness,  and  what  information  they  exploit. 

The new heuristics have been tested alongside two existing heuristics, namely
Fail-First (FF) and Brelaz (Bz), on a variety of uniform and non-uniform
problems, using a forward checking algorithm FC-CBJ. On uniform problems,
the new heuristics perform similarly to each other and dominate FF. Bz was
consistently better on sparse and moderately dense constraint graphs, and was
easier to calculate. As constraint graph density increased to the point of
becoming a clique, Bz performance degraded to that of FF. With respect to
problem size, the new heuristics appear to scale better than FF and Bz.

Problems with non-uniform constraint tightnesses exposed poor behaviour
from Bz. This was expected, because Bz exploits information from the domain
sizes and topology of the constraint graph, but ignores the tightness of
constraints. Experiments on problems with non-uniform domains demonstrated that
ignoring information on domain sizes results in poor performance.

In  some  respects  the  work  reported  here  might  be  considered  as  a  first  foray 
into  a  better  understanding  of  what  makes  heuristics  work.  Further  work  could 
include  determining  the  importance  of  tie-breaking  in  the  heuristic  Bz,  compared 
to  simply  choosing  the  first  variable  sensibly.  Faster  substitutes  for  the  heuristics 
would  allow  us  to  investigate  the  hypothesis  that  the  new  heuristics  scale  better 
than  the  old.  Little  has  been  done  to  compare  the  ranking  of  the  new  heuristics  on 
an  individual  problem  basis.  We  would  also  like  to  investigate  the  performance 
of  the  new  heuristics  in  problems  where  there  is  a  very  large  set  of  different 
domain  sizes  at  the  start  of  search. 
Abstract. The goal of this paper is twofold. First, we introduce a class
of local search procedures for solving optimization and constraint problems.
These procedures are based on various heuristics for choosing variables
and values in order to examine a general neighborhood. Second,
four combinations of heuristics are empirically evaluated by using the
graph-coloring problem and a real world application, the frequency
assignment problem. The results are also compared with those obtained
with other approaches including simulated annealing, Tabu search,
constraint programming and heuristic graph coloring algorithms. Empirical
evidence shows the benefits of this class of local search procedures for
solving large and hard instances.

Keywords: Local search, constraint solving, combinatorial optimization,
graph coloring, frequency assignment.


1  Introduction 

Constraint problems embody a class of general problems which are important
both in theory and in practice. Well-known examples include constraint
satisfaction problems (CSP), maximal constraint satisfaction problems (MCSP)
[5] and constraint satisfaction optimization problems (CSOP) [18]. Constraint
problems can be considered as search problems, i.e., given a finite search space
composed of a set of configurations, we want to find one or more particular
configurations which minimize (or maximize) certain pre-defined criteria. Constraint
problems have many practical applications related to scheduling, transportation,
layout/circuit design, telecommunications and so on. In general, constraint
problems are NP-hard. Consequently, there is little hope of finding a deterministic
polynomial-time algorithm for this class of problems. Given their practical importance,
many methods have been devised to tackle these search problems. This paper
looks at one class of methods which are based on surprisingly simple, yet
powerful local search techniques.

*  Work  partially  supported  by  the  CNET  (French  National  Research  Center  for 
Telecommunications)  under  the  grant  No.940B006-01. 
Local search (LS), also called neighborhood search, constitutes an important
class of general heuristic methods based on the notion of neighborhood [14].
Starting with an initial configuration, a typical LS procedure iteratively replaces
the current configuration by one of its neighbors, which is often of better quality,
until some stopping criteria are met; for example, a fixed number of iterations is
reached or a sufficiently good local optimum is found. Well-known examples of
LS-based methods include simulated annealing (SA) [11], Tabu search [6] and
various forms of hill-climbers [1]. Given that LS uses only a neighborhood
function and possibly some other general notions, it can be applied to a large class of
problems. Traditionally, LS was used with success to tackle well-known NP-hard
combinatorial optimization problems such as TSP [12] and graph partitioning
[10]. More recent applications of LS include the graph coloring problem [8, 9, 3],
CSPs [13], and the satisfiability problem [7, 16].

Local search is essentially based on three components: a configuration
structure (encoding), a neighborhood function defined on the configuration structure,
and a neighborhood examination mechanism. The first component defines the
search space S of the application, the second associates a subset of S with
each point of the search space, while the third defines the way of going from one
point to another. The configuration structure is often application-dependent and
should be chosen in such a way that it reflects the natural solution space of the
problem and facilitates its exploration and exploitation. For a given
neighborhood, the way in which neighbors are examined is certainly the most decisive
factor in the performance of an LS procedure.

In this paper, we present a class of LS procedures for solving optimization
and constraint problems. These LS procedures are based on different heuristics
for examining a general neighborhood. Some heuristics are well-known and
others less so. Computational tests are carried out on two NP-hard problems: graph
coloring and frequency assignment in mobile radio networks. Experimental
evidence shows the benefits of these procedures for solving large and hard instances.
The results are compared with those obtained by two other LS procedures, SA
and Tabu search, and by two other heuristic methods, constraint programming and
heuristic graph coloring algorithms.

2  Constraint  Problems  and  Local  Search 

2.1  Constraint  Problems 

In order to apply LS to constraint problems, we will consider constraint problems
as combinatorial optimization problems, i.e., a constraint problem P is defined
by a quadruple $\langle X, D, C, f \rangle$ where

- $X = \{V_1, V_2, \ldots, V_n\}$ is the set of distinct variables in P,

- $D = \{D_1, D_2, \ldots, D_n\}$ is the set of domains of the variables,

- $C$ is the set of constraints, each $C_i$ being a relation on a subset of the variables,

- $f$ is a cost function to be minimized (maximized).
With this definition, several cases are possible. First, if P is a standard CSP,
there will be no associated cost function. Therefore, any assignment of values to
the variables satisfying the constraints of C will be a solution. Second, if P is a
CSOP, then solving P implies finding assignments such that all the constraints
are satisfied and the cost $f$ is minimized (or maximized). Third, if P is a MCSP,
i.e., the underlying CSP $\langle X, D, C \rangle$ is not satisfiable, then solving P is to
maximize (minimize) the number of satisfied (unsatisfied) constraints.

Note that in this formulation, the cost function $f$ is not necessarily the
evaluation function which is required by any LS procedure to evaluate the
quality of configurations.

2.2  Configuration  and  Neighborhood 

Given a constraint problem $P = \langle X, D, C, f \rangle$, the configuration structure
is defined as follows: a configuration s is any complete assignment such that
$s = \{\langle V_i, v_i \rangle \mid V_i \in X \text{ and } v_i \in D_i\}$. We use $T(s, V_i)$ to denote the value of $V_i$ in
s, i.e., if $\langle V_i, v_i \rangle \in s$ then $T(s, V_i) = v_i$. The search space S of the problem P is
then defined as the set of all the possible configurations. Clearly the cardinality
of S is equal to the product of the sizes of the domains, i.e., $\prod_{i=1}^{n} |D_i|$.

In  general,  the  configuration  structure  is  application-dependent.  However, 
some configuration structures are general enough to be applied to many applications.
The above structure is such an example. In fact, it can be used to model
problems  such  as  graph  coloring,  satisfiability  and  many  CSPs.  Another  general 
configuration  structure  is  “permutation”  which  can  be  used  to  model  naturally 
the  traveling  salesman  problem. 

Given the configuration structure defined above, the neighborhood function
$N : S \rightarrow 2^S$ may be defined in many ways. In this paper, we use the
one-difference or one-move neighborhood¹. Formally, let $s \in S$ be a configuration,
then $s' \in N(s)$ if and only if there exists one and only one $i \in [1..n]$ such
that $T(s, V_i) \neq T(s', V_i)$. In other words, a neighbor of a configuration s can be
obtained by changing the current value of a variable in s. Since a variable $V_i$ in
s can take any of its $|D_i|$ values, s has exactly $\sum_{i=1}^{n} (|D_i| - 1) = \sum_{i=1}^{n} |D_i| - n$

neighbors. Note that this neighborhood relation is symmetric; strictly speaking it is not reflexive, since a neighbor differs from s in exactly one variable.
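The one-move neighborhood and the two counting formulas can be illustrated with a short sketch. The dictionary-based representation, the function names and the three-variable example are assumptions made for illustration only; they are not taken from the paper.

```python
import math


def neighbors(config, domains):
    """Yield each configuration that differs from `config` in exactly one variable."""
    for var, current in config.items():
        for value in domains[var]:
            if value != current:
                neighbor = dict(config)
                neighbor[var] = value
                yield neighbor


def search_space_size(domains):
    """|S|: the product of the domain sizes."""
    return math.prod(len(d) for d in domains.values())


def neighborhood_size(domains):
    """|N(s)|: the sum of the domain sizes minus the number of variables."""
    return sum(len(d) for d in domains.values()) - len(domains)


# Hypothetical three-variable problem with domains of size 3, 2 and 4.
domains = {"V1": [0, 1, 2], "V2": [0, 1], "V3": [0, 1, 2, 3]}
s = {"V1": 0, "V2": 1, "V3": 3}
found = list(neighbors(s, domains))
```

For this example the search space has 3 x 2 x 4 = 24 configurations, and s has (3 + 2 + 4) - 3 = 6 neighbors.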

2.3  Neighborhood  Examination 

We  now  turn  to  different  ways  of  examining  the  neighborhood,  i.e.,  going  from 
one  configuration  to  another.  In  this  paper,  we  use  a  two-step,  heuristic-based 
examination  procedure  to  perform  a  move. 

1.  first  choose  a  variable, 

2.  and  then  choose  a  value  for  the  selected  variable. 

It is easy to see that many heuristics are possible for both choices. In what
follows, we present various heuristics for choosing variables and values.


¹ In general, a k-move neighborhood can be defined.




Heuristics for choosing the variable $V_i$:

- var.1 random: pick randomly a variable $V_i$ from X;

- var.2 conflict-random: pick randomly a variable from the conflict set defined
by $\{V_i \mid V_i \in X \text{ is implicated in an unsatisfied constraint}\}$;

- var.3 most-constrained: pick a most constrained variable, for instance, the
one which occurs in the largest number of unsatisfied constraints (break ties
randomly).
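The three variable-choice heuristics can be sketched as follows. This is a minimal illustration, assuming that a constraint is represented as a pair (scope, predicate); that representation, the function names and the small example are hypothetical, not the authors' implementation.

```python
import random


def unsatisfied(config, constraints):
    """Return the scopes of the constraints violated by `config`."""
    return [scope for scope, holds in constraints
            if not holds(*(config[v] for v in scope))]


def var_random(config, constraints, rng):
    """var.1: any variable, uniformly at random."""
    return rng.choice(sorted(config))


def var_conflict_random(config, constraints, rng):
    """var.2: a random variable from the conflict set (None if it is empty)."""
    conflict_set = sorted({v for scope in unsatisfied(config, constraints)
                           for v in scope})
    return rng.choice(conflict_set) if conflict_set else None


def var_most_constrained(config, constraints, rng):
    """var.3: a variable occurring in the most violated constraints (random ties)."""
    counts = {}
    for scope in unsatisfied(config, constraints):
        for v in scope:
            counts[v] = counts.get(v, 0) + 1
    if not counts:
        return None
    top = max(counts.values())
    return rng.choice(sorted(v for v, c in counts.items() if c == top))


# Hypothetical example: three "must differ" constraints over four variables.
differ = lambda a, b: a != b
constraints = [(("A", "B"), differ), (("A", "C"), differ), (("C", "D"), differ)]
config = {"A": 1, "B": 1, "C": 1, "D": 2}
rng = random.Random(0)
```

Here A occurs in two violated constraints and B and C in one each, so var.3 selects A, while var.2 may return any of A, B or C.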

Heuristics for choosing the value for $V_i$:

- val.1 random: pick randomly a value from $D_i$;

- val.2 best-one (min-conflicts [13]): pick a value which gives the greatest
improvement in the evaluation function (break ties randomly). If no such value
exists, pick randomly a value which does not lead to a deterioration in the
evaluation function (the current value of the variable may be picked);

- val.3 stochastic-best-one: pick a best, different value which does not lead
to a deterioration in the evaluation function. If no such value exists, with
probability p, take a value which leads to the smallest deterioration;

- val.4 first-improvement: pick the first value which improves the evaluation
function. If no such value exists, take a value which does not lead to a
deterioration or which leads to the smallest deterioration²;

- val.5 probabilistic-improvement: with probability p, apply the val.1 random
heuristic; with probability 1 − p, apply the val.2 best-one heuristic.
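A possible reading of val.1, val.2, val.3 and val.5 is sketched below (val.4 is omitted for brevity). The function signatures, the `delta` helper and the triangle example are assumptions for illustration. In particular, the text does not say what val.3 does when the deteriorating move is rejected; the sketch assumes the variable simply keeps its current value.

```python
import random


def delta(config, var, value, cost):
    """Change in the evaluation function if `var` took `value` (negative = better)."""
    return cost({**config, var: value}) - cost(config)


def val_random(config, var, domain, cost, rng):
    """val.1: any value of the domain."""
    return rng.choice(domain)


def val_best_one(config, var, domain, cost, rng):
    """val.2: a best improving value, else a random non-deteriorating one."""
    deltas = {v: delta(config, var, v, cost) for v in domain}
    best = min(deltas.values())
    target = best if best < 0 else 0  # delta 0 always includes the current value
    return rng.choice([v for v, d in deltas.items() if d == target])


def val_stochastic_best_one(config, var, domain, cost, rng, p=0.1):
    """val.3: a best different value if not worse, else the least bad one with probability p."""
    deltas = {v: delta(config, var, v, cost) for v in domain if v != config[var]}
    if not deltas:
        return config[var]
    best = min(deltas.values())
    if best <= 0 or rng.random() < p:
        return rng.choice([v for v, d in deltas.items() if d == best])
    return config[var]  # assumption: otherwise the variable keeps its value


def val_probabilistic_improvement(config, var, domain, cost, rng, p=0.05):
    """val.5: val.1 with probability p, val.2 otherwise."""
    if rng.random() < p:
        return val_random(config, var, domain, cost, rng)
    return val_best_one(config, var, domain, cost, rng)


# Hypothetical example: a triangle to be colored with colors 0, 1, 2.
EDGES = [("a", "b"), ("b", "c"), ("a", "c")]
COLORS = [0, 1, 2]


def conflicts(config):
    return sum(1 for u, v in EDGES if config[u] == config[v])


rng = random.Random(0)
improvable = {"a": 0, "b": 0, "c": 1}     # one conflict, removed by a = 2
local_optimum = {"a": 2, "b": 0, "c": 1}  # no conflict, any change of a is worse
```

On `local_optimum`, val.2 can only return the current value of a, whereas val.3 with p > 0 may move to a worse value; this is the difference the text points to when it says that val.2 forbids deteriorative moves.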

Let  us  note  first  that  val.2  forbids  deteriorative  moves.  As  we  will  see  later, 
this  property  will  penalize  its  performance  compared  with  others. 

Both val.3 and val.5 use a probability in order to accept deteriorative moves.
The purpose of accepting such moves is to prevent the search from being stuck in
local optima by changing the search direction from time to time. This
probability may be static or dynamic. A static probability will not be changed during the
search, while a dynamic one may be modified by using some pre-defined
mathematical laws or may adapt itself during the search. In any case, this probability
is determined more often empirically than theoretically.

val.5 is similar to the random-walk heuristic used by GSAT [17], but the two
differ: for the satisfiability problem, there is only one explicit choice, i.e.,
the choice of the variable to flip. Here the heuristic determines the value for a
chosen variable, not the variable itself. Another interesting point is that varying
the probability p will lead to different heuristics. If p = 1, val.5 becomes val.1
random. If p = 0, val.5 becomes val.2 best-one. Finally, both the p part and the
1 − p part can be replaced by other heuristics.

Evidently, any combination of a var.x heuristic and a val.y heuristic gives a
different strategy for examining the neighborhood. There are 15 possibilities in
our case (3 variable heuristics times 5 value heuristics). One aim of this work
is to assess the performance of these combinations.
Note that an extensive study on the val.2 min-conflicts heuristic for solving CSPs
has been carried out and conclusions have been drawn [13]. However, that work
concerns essentially the value choice for a given conflict variable. In this work,
we study various combinations for choosing both variables and values.

² As for val.3, a probability can be introduced here to control deteriorative moves.

Note finally that in order to implement the above variable/value choice
heuristics efficiently, special data structures are indispensable for recognizing
the conflict or the most constrained variables and the appropriate new value for
the chosen variable.


2.4  Heuristic  Local  Search  Template 

Using the above variable/value choice heuristics, various heuristic local search
(HLS) procedures can be built. The general HLS template is given below:

Procedure Heuristic Local Search (HLS)

Input:
    P = <X, D, C, f>: the problem to be solved;
    f, L: objective function and its lower bound to be reached;
    MAX: maximum number of iterations allowed;

Output:
    s: the best solution found;

begin
    generate(s);  /* generate an initial configuration */
    I ← 0;  /* iterations counter */
    while (f(s) > L) and (I < MAX) do
        choose a variable Vi ∈ X;  /* heuristics var.x */
        choose a value vi ∈ Di for Vi;  /* heuristics val.y */
        if T(s, Vi) ≠ vi then
            s ← s − {<Vi, T(s, Vi)>} + {<Vi, vi>};
        I ← I + 1;
    output(s);
end
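The template can be turned into a small runnable sketch. The code below is one possible instantiation, not the authors' implementation: the function `hls`, its parameters and the 5-cycle coloring example are hypothetical, and keeping a separate copy of the best configuration seen so far is an added assumption (the template itself simply outputs s).

```python
import random


def hls(variables, domains, evaluate, choose_var, choose_val,
        lower_bound=0, max_iter=10_000, rng=None):
    """Heuristic local search: returns (best configuration, its cost, iterations)."""
    rng = rng or random.Random()
    s = {v: rng.choice(domains[v]) for v in variables}  # generate(s)
    best, best_cost = dict(s), evaluate(s)
    iterations = 0
    while evaluate(s) > lower_bound and iterations < max_iter:
        var = choose_var(s, rng)       # heuristic var.x
        val = choose_val(s, var, rng)  # heuristic val.y
        if s[var] != val:
            s[var] = val
            cost = evaluate(s)
            if cost < best_cost:       # assumption: remember the best seen
                best, best_cost = dict(s), cost
        iterations += 1
    return best, best_cost, iterations


# Hypothetical instance: 3-coloring of a 5-cycle.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
VARIABLES = list(range(5))
DOMAINS = {v: [0, 1, 2] for v in VARIABLES}


def conflicts(s):
    """Evaluation function: number of unsatisfied constraints."""
    return sum(1 for u, v in EDGES if s[u] == s[v])


def conflict_random(s, rng):
    """var.2: a random variable involved in an unsatisfied constraint."""
    in_conflict = sorted({x for u, v in EDGES if s[u] == s[v] for x in (u, v)})
    return rng.choice(in_conflict)


def probabilistic_improvement(s, var, rng, p=0.05):
    """val.5: a random value with probability p, else a best value (random ties)."""
    if rng.random() < p:
        return rng.choice(DOMAINS[var])
    costs = {c: conflicts({**s, var: c}) for c in DOMAINS[var]}
    best = min(costs.values())
    return rng.choice([c for c, k in costs.items() if k == best])


best, best_cost, iterations = hls(VARIABLES, DOMAINS, conflicts, conflict_random,
                                  probabilistic_improvement, max_iter=2000,
                                  rng=random.Random(0))
```

Recomputing the evaluation function from scratch at every step keeps the sketch short; an efficient implementation would instead maintain conflict counts incrementally.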

This HLS template uses two parameters L and MAX in the stop condition. L
fixes the (optimization) objective to be reached and MAX the maximum number
of iterations allowed. Therefore, the procedure stops either when an optimal
solution has been found or when MAX iterations have been performed. The complexity
of such a procedure depends on MAX, the size of the domains $|D_i|$ and the way in
which the neighborhood is examined.


3  Experimentation  and  Results 


In this section, we present empirical results of HLS procedures which are based on
some representative combinations of heuristics introduced above for exploiting
the neighborhood structure. Tests are carried out on two NP-complete problems:
graph coloring (COL) and frequency assignment (FAP). For the
COL problem, test instances come essentially from the archives of the second
DIMACS Implementation Challenge³. For the FAP problem, instances (60
in total) are provided by the CNET (French National Research Center for
Telecommunications)⁴. A limited number of instances for each problem are used
to evaluate four combinations of heuristics for choosing variables/values. More
instances are then solved to compare these HLS procedures with other approaches
including Tabu search, SA, constraint programming, and heuristic graph
coloring algorithms.


3.1  Tests and Problem Encoding

Graph Coloring

There are two main reasons for choosing graph coloring as our test problem.
First, it is a well-known reference for NP-complete problems. Second, there are
standard benchmarks in the public domain.

The basic COL is stated as a decision problem: given k colors and a graph
$G = \langle E, V \rangle$ (E being the set of vertices and V the set of edges), is it
possible to color the vertices of E with the k colors in such a
way that any two adjacent vertices have different colors? In practice, one is
also interested in the optimization version of the problem: given a graph G, find
the smallest k (the chromatic number) with which there is a k-coloring for G.

Many classic methods for graph coloring are based on either exhaustive
search such as branch-and-bound techniques or successive augmentation
heuristics such as Brelaz's DSATUR algorithm, Leighton's Recursive Largest First
(RLF) algorithm and the more recent XRLF by Johnson et al. [9]. Recently,
local search procedures such as SA [9] and Tabu search [8, 3] were also applied
to the coloring problem.

In order to apply our HLS procedures to the graph-coloring problem, the
COL must first be encoded. Given a graph $G = \langle E, V \rangle$, we transform G into
the following constraint problem $\langle X, D, C, f \rangle$ where

- X = E is the set of the vertices of G,

- D is the set of the k integers representing the available colors,

- C is the set of constraints specifying that the colors assigned to u and v
must be different if $\{u, v\} \in V$,

- f is the number of colors used to obtain a proper (conflict-free) coloring.

With this encoding, a (proper or improper) coloring of G will be a complete
assignment $\{\langle u, i \rangle \mid u \in E, i \in D\}$. Coloring G consists in assigning integers
in D (colors) to the vertices in E in such a way that all the constraints of C are
satisfied while a minimum number k of colors is used.
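As a small illustration of this encoding, the sketch below builds X, D and C from a graph and checks an assignment against them. The function names and the four-vertex graph are hypothetical and serve only to make the definitions concrete.

```python
def encode_coloring_problem(vertices, edges, k):
    """Return (X, D, C) for the k-coloring decision problem of a graph."""
    X = list(vertices)              # one variable per vertex
    D = list(range(k))              # the k available colors
    C = [(u, v) for u, v in edges]  # each edge demands color(u) != color(v)
    return X, D, C


def is_proper(coloring, C):
    """True if every constraint of C is satisfied."""
    return all(coloring[u] != coloring[v] for u, v in C)


def colors_used(coloring):
    """The cost f: number of distinct colors in the assignment."""
    return len(set(coloring.values()))


# Hypothetical graph: a triangle a-b-c with a pendant vertex d attached to c.
X, D, C = encode_coloring_problem(
    "abcd", [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")], k=3)
proper = {"a": 0, "b": 1, "c": 2, "d": 0}
improper = {"a": 0, "b": 0, "c": 2, "d": 2}
```

Both dictionaries are complete assignments, hence configurations of the search space, but only `proper` satisfies every constraint of C.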

In order to minimize k, an HLS procedure solves a series of CSPs (decision
problems). More precisely, it begins with a big k and tries to solve the underlying
CSP $\langle X, D, C \rangle$. If a proper k-coloring is found, the search process proceeds
to color the graph with k − 1 colors and so on. If no coloring can be found with
the current k colors, the search stops and reports on the last proper coloring
found. In other words, it tries to solve a harder problem (CSP) with fewer colors
if it manages to solve the current one. In order to solve each underlying CSP,
our local search procedure needs an evaluation function to measure the relative
quality of each configuration (which is usually an improper coloring). Several
possibilities exist to define this function. In this paper, it is defined simply as
the number of unsatisfied constraints.

³ DIMACS benchmarks are available from ftp dimacs.rutgers.edu.

⁴ Another set of FAP instances are available from ftp ftp.cs.city.ac.uk. These tests
correspond to sparse graphs and are consequently much less constrained.
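The descending-k scheme can be sketched as a driver around any procedure that attempts a k-coloring. In the sketch below, `exhaustive_solver` is a brute-force stand-in used only to keep the example deterministic; in the setting of the paper that role would be played by an HLS run, which may fail even when a k-coloring exists. All names and the 5-cycle instance are assumptions for illustration.

```python
from itertools import product


def count_conflicts(coloring, edges):
    """Evaluation function: number of unsatisfied (monochromatic) edges."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])


def minimize_colors(vertices, edges, k_start, solve_k):
    """Try k = k_start, k_start - 1, ... and return the last (k, coloring) that succeeded."""
    best = None
    k = k_start
    while k >= 1:
        coloring = solve_k(vertices, edges, k)
        if coloring is None:  # no proper k-coloring found: stop
            break
        best = (k, coloring)
        k -= 1
    return best


def exhaustive_solver(vertices, edges, k):
    """Stand-in for one HLS run: brute force, only viable for tiny graphs."""
    for colors in product(range(k), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        if count_conflicts(coloring, edges) == 0:
            return coloring
    return None


# Hypothetical instance: a 5-cycle, whose chromatic number is 3.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k, coloring = minimize_colors(vertices, edges, k_start=5, solve_k=exhaustive_solver)
```

With the exhaustive stand-in the driver stops at k = 3, since an odd cycle has no proper 2-coloring. With an incomplete local search, a failure at some k only means that no coloring was found within the iteration limit.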

The  DIMACS  archives  contain  more  than  50  benchmarks.  We  have  chosen  a 
subset  of  them  for  our  experiments.  Note  that  the  main  objective  of  this  work  is 
not  to  improve  on  the  best  known  results  for  these  instances.  In  order  to  achieve 
any  improvement,  local  search  alone  may  not  be  sufficient.  It  must  be  combined 
with  special  coloring  techniques.  In  this  work,  we  use  these  instances  to  study 
the  behavior  of  our  HLS  procedures.  For  this  purpose,  we  have  chosen  some  16 
small  and  medium  size  (<  500  vertices)  instances  from  different  classes. 

Frequency  Assignment  Problem 

Our second test problem concerns a real world application: the frequency
assignment problem, which is a key application in mobile radio network engineering.
As we will see later, the basic FAP can be easily shown to be NP-complete.

The  main  goal  of  the  FAP  consists  in  finding  frequency  assignments  which 
minimize  the  number  of  frequencies  (or  channels)  used  in  the  assignment  and  the 
electro-magnetic  interference  (due  to  the  re-use  of  frequencies).  The  difficulty  of 
this  application  comes  from  the  fact  that  an  acceptable  solution  of  the  FAP  must 
satisfy  a  set  of  multiple  constraints,  some  of  these  constraints  being  orthogonal. 
The  most  severe  constraint  concerns  a  very  limited  radio  spectrum  consisting 
of  a  small  number  of  frequencies  (usually  about  60).  This  constraint  imposes 
a  high  degree  of  frequency  re-use,  which  in  turn  increases  the  probability  of 
frequency  interference.  In  addition  to  this  frequency  constraint,  two  other  types 
of  constraints  must  be  satisfied  to  ensure  communication  of  a  good  quality: 

1. Traffic constraints: the minimum number of frequencies required by each
station $S_i$ to cover the communications of the station, denoted by $T_i$.

2. Frequency interference constraints, which belong to two categories: (a) co-station
constraints, which specify that any pair of frequencies assigned to a
station must have a certain distance between them in the frequency domain;
(b) adjacent-station constraints, which specify that the frequencies assigned
to two adjacent stations must be sufficiently separated. Two stations are
considered adjacent if they have a common emission area.

About  60  instances  were  used  in  our  experiments.  Some  of  them  are  not  only 
very  large  in  terms  of  number  of  stations  and  in  terms  of  number  of  interference 
constraints,  but  also  very  difficult  to  solve.  The  FAP  instances  we  used  in  our 
experiments  belong  to  three  different  sets  which  are  specified  below. 
- Test-Set-No.1 Traffic constraints: one frequency per station. Consequently,
there is no co-station constraint. Adjacent constraints: frequencies assigned
to two adjacent stations must be different.

- Test-Set-No.2 Traffic constraints: two frequencies per station. Co-station
constraints: frequencies assigned to a station must have a minimum distance
of 3. Adjacent constraints: frequencies assigned to two adjacent stations must
have a minimum distance of 1 or 2 according to the station.

- Test-Set-No.3 Traffic constraints: up to 4 frequencies per station.
Co-station and adjacent constraints: the separation distance between frequencies
assigned to the same station or adjacent stations varies from 2 to 4.

In  fact,  Test-Set-No.l  corresponds  to  the  graph-coloring  problem.  To  see 
this,  frequencies  should  be  replaced  by  colors,  stations  by  vertices  and  adjacent 
constraints  by  edges.  Finding  an  optimal,  i.e.,  an  interference-free,  frequency 
assignment  using  a  minimum  number  of  distinct  frequencies  is  equivalent  to 
coloring  a  graph  with  a  minimum  number  of  colors. 

In order to apply our HLS procedures, the FAP will be encoded as a
constraint (optimization) problem $\langle X, D, C, f \rangle$ such that

- $X = \{L_1, L_2, \ldots, L_{NS}\}$ where each $L_i$ represents a list of $T_i$ frequencies
required by the station $S_i$ and NS is the number of stations in the network,

- D is the set of the NF integers representing the NF available frequencies,

- C is the set of co-station and adjacent-station interference constraints,

- f is the number of frequencies used to obtain conflict-free assignments.

It is easy to see that with this encoding, a frequency assignment has the
length of $\sum_{i=1}^{NS} T_i$. Fig. 1 gives an example where the traffic of the three stations
is respectively 2, 1 and 4 frequencies and $f_{i,j}$ represents the j-th frequency value
of the i-th station $S_i$.


[Figure: a row of seven cells, f1,1 f1,2 | f2,1 | f3,1 f3,2 f3,3 f3,4, grouped under three labels, one per station]


Fig. 1. FAP encoding


Solving the FAP consists in finding assignments which satisfy all the
interference constraints in C and minimize NF, the number of frequencies used. To
minimize NF, we use the same technique as that explained above for graph coloring.
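The FAP encoding and its evaluation function (number of unsatisfied interference constraints) can be sketched as follows. The traffic vector 2, 1, 4 follows the example of Fig. 1; the adjacency list, the separation distances (chosen in the style of Test-Set-No.2) and all function names are assumptions made for illustration.

```python
from itertools import combinations


def build_variables(traffic):
    """One integer variable per (station, slot) pair; sum(traffic) in total."""
    return [(i, j) for i, t in enumerate(traffic) for j in range(t)]


def count_interference(assignment, traffic, adjacent, co_dist, adj_dist):
    """Number of violated co-station and adjacent-station constraints."""
    violations = 0
    # Co-station: frequencies of one station must be at least co_dist apart.
    for i, t in enumerate(traffic):
        for a, b in combinations(range(t), 2):
            if abs(assignment[(i, a)] - assignment[(i, b)]) < co_dist:
                violations += 1
    # Adjacent-station: frequencies of adjacent stations at least adj_dist apart.
    for i, k in adjacent:
        for a in range(traffic[i]):
            for b in range(traffic[k]):
                if abs(assignment[(i, a)] - assignment[(k, b)]) < adj_dist:
                    violations += 1
    return violations


traffic = [2, 1, 4]          # as in Fig. 1
adjacent = [(0, 1), (1, 2)]  # hypothetical adjacency between stations
variables = build_variables(traffic)

good = dict(zip(variables, [1, 4, 2, 1, 4, 7, 10]))
bad = dict(zip(variables, [1, 4, 4, 1, 4, 7, 10]))  # station 1 reuses frequency 4

good_cost = count_interference(good, traffic, adjacent, co_dist=3, adj_dist=1)
bad_cost = count_interference(bad, traffic, adjacent, co_dist=3, adj_dist=1)
frequencies_used = len(set(good.values()))
```

The assignment `good` is interference-free and uses 5 distinct frequencies, while `bad` violates two adjacent-station constraints because station 1 shares frequency 4 with both of its neighbors.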


3.2  Comparison  of  Strategies  for  Neighborhood  Examination 

We have chosen to compare four combinations of heuristics: var.1 random/val.2
min-conflicts, var.2 conflict-random/val.2 min-conflicts, var.2 conflict-random/
val.3 stochastic-best-one, var.2 conflict-random/val.5 probabilistic-improvement.
The main reason for choosing these strategies is to have a good sample which
combines two important aspects of a search strategy: randomness and guidance.
The var.1 random/val.1 random combination, which represents a random search
strategy, was also tested. The results of this strategy were so bad that we have
decided to omit them from the comparison. The probability p used in var.2/val.3
is fixed at 0.1 for the COL and 0.2 for the FAP. The probability p
used in var.2/val.5 is fixed at 0.05 for both problems.

Table 1 shows the comparative results of these four strategies for 6 COL
instances (3 structured graphs, le450_15[c-d].col and flat300_28_0.col, and 3 random
graphs). For each instance and each strategy, we give the mean number of colors
(Colors) over 5 runs and the mean number of evaluations (Eval.) in thousands,
needed to find a coloring. These two criteria reflect respectively the quality of a
solution and the efficiency of a strategy.


Table 1. Comparison of heuristics for COL

Problems           Runs   var1/val2         var2/val2         var2/val3         var2/val5
                          Colors   Eval.    Colors   Eval.    Colors   Eval.    Colors   Eval.
R125.1c.col        5      [?]      1141     47.80    2939     47.60    128      [?]      653
R250.5.col         5      [?]      9800     70.60    8500     70.00    5600     [?]      1420
DSJC250.5.col      5      [?]      2000     34.80    2100     31.60    1240     32.00    1280
le450_15c.col      5      22.60    6173     >25      >5100    21.60    2574     21.80    2907
le450_15d.col      5      22.60    3789     >25      >5100    21.40    3094     21.20    3906
flat300_28_0.col   5      35.00    20368    >38      >7100    34.00    2832     34.80    4486

[?]: value illegible in the source.

From Table 1, we observe that compared with the other strategies, var.2
conflict-random/val.2 min-conflicts gives the worst results. In fact, for 5 out of
6 instances, it requires more colors and more evaluations than the others to
find colorings. In particular, it has serious problems coloring the 3 structured
graphs even with up to 10 colors more than the minimum. var.1 random/val.2
min-conflicts is a little better than var.2/val.2 for 5 instances, but worse than
var.2 conflict-random/val.3 stochastic-best-one and var.2 conflict-random/val.5
probabilistic-improvement. Finally, the results of var.2/val.3 and var.2/val.5 are
similar with a slightly better performance for the first: var.2/val.3 performs a
little better than var.2/val.5 for 4 instances. Therefore, for the coloring
problem, it seems that the following relation holds: (var.2/val.3 ≈ var.2/val.5) >
var.1/val.2 > var.2/val.2, where ≈ and > mean respectively "comparable" and
"better than" in terms of the solution-quality/efficiency. This relation was
confirmed when we tried to solve other COL instances.

Table 2 shows the comparative results of these four strategies for 5 FAP
instances. Each instance is specified by three numbers nf.tr.nc which are
respectively the lower bound of the number of distinct frequencies necessary to have
an interference-free assignment, the sum total of the traffic (the number of
integer variables) of all stations, and the number of co-station and adjacent-station
interference constraints. For example, 16.300.12370 means that this instance has
a total traffic of 300 and 12,370 interference constraints, and requires at least 16
frequencies to have an interference-free assignment. The same criteria as for the
COL are used except that the "Colors" criterion is replaced by the mean number of
the frequencies (NF) for having interference-free assignments.
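The nf.tr.nc naming convention can be decoded mechanically. The helper below is a hypothetical convenience, not part of the original work; its field names are chosen here for readability.

```python
from typing import NamedTuple


class FapInstance(NamedTuple):
    nf_lower_bound: int  # lower bound on the number of distinct frequencies
    traffic: int         # total traffic, i.e. number of integer variables
    constraints: int     # number of interference constraints


def parse_instance(name):
    """Split an instance name of the form nf.tr.nc into its three integers."""
    nf, tr, nc = (int(part) for part in name.split("."))
    return FapInstance(nf, tr, nc)


example = parse_instance("16.300.12370")
```

For the instance named in the text, this gives a lower bound of 16 frequencies, a total traffic of 300 and 12,370 interference constraints.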

From Table 2, we see once again that var.2 conflict-random/val.2 min-conflicts
is the worst strategy. Even with 10 frequencies more than the lower bound (16), it
cannot find a conflict-free assignment for 16.300.12370. For the other instances,
it requires between 3 and 6 frequencies more than the lower bound. At the other
extreme, we notice that var.2 conflict-random/val.3 stochastic-best-one
dominates the others for the 5 instances. Finally, the performance of var.2
conflict-random/val.5 probabilistic-improvement is between that of var.1 random/val.2
min-conflicts and var.2/val.3. Therefore, for the FAP, it seems that the
following relation holds: var.2/val.3 > var.2/val.5 > var.1/val.2 > var.2/val.2.
This relation was confirmed when we tried to solve other FAP instances.


Table 2. Comparison of heuristics for FAP

Problems        Runs   var1/val2        var2/val2        var2/val3        var2/val5
                       NF      Eval.    NF      Eval.    NF      Eval.    NF      Eval.
8.150.3308      5      8.0     916      14.00   1229     8.00    47       8.00    50
15.300.8940     5      16.00   1769     18.00   741      [?]     1804     15.40   2016
16.150.3203     5      17.80   1246     21.80   6731     16.75   10236    17.60   2638
16.300.12370    5      18.75   5978     >26     >1350    [?]     2535     19.20   3120
30.600.45872    5      30.00   6513     33.20   11080    [?]     570      [?]     1700

[?]: value missing or illegible in the source.

Based on the data in Tables 1 and 2, we can make several remarks. First, we
notice that var.2 conflict-random/val.3 stochastic-best-one and var.2
conflict-random/val.5 probabilistic-improvement turn out to be winners for both
problems compared with var.1 random/val.2 min-conflicts and var.2
conflict-random/val.2 min-conflicts. As for most heuristic search procedures,
there is no theoretical justification for this. However, intuitive explanations
help to understand the result. The tested instances represent hard problems and
may contain many deep local optima. For a search procedure to have a chance of
finding a solution, it must be able not only to converge efficiently towards
optima, but also to escape from local optima. Both var.2/val.3 and var.2/val.5
have this capacity, thanks to the deteriorative moves authorized by var.2/val.3
and the random moves in var.2/val.5. In contrast, var.1 random/val.2
min-conflicts and var.2 conflict-random/val.2 min-conflicts are trapped once
they have reached a local optimum, since deteriorative moves are forbidden.
Although they both allow side-walk moves by taking values that do not change
the value of the evaluation function, this is not sufficient to escape from
local optima.
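
The contrast drawn above can be sketched in a few lines of Python for graph
coloring. This is only an illustration under stated assumptions: the function
names are hypothetical, and the random move taken with probability p is a
generic random-walk step standing in for the val.3 and val.5 rules, whose exact
definitions are given earlier in the paper and are not reproduced here. With
p = 0 the sketch reduces to a pure conflict-random/min-conflicts search, which
never accepts a deteriorative move.

```python
import random


def conflicts(graph, coloring, v, c=None):
    """Number of neighbours of v that carry colour c (default: v's own colour)."""
    c = coloring[v] if c is None else c
    return sum(1 for u in graph[v] if coloring[u] == c)


def total_conflicts(graph, coloring):
    """Evaluation function: number of violated edges (each edge is seen twice)."""
    return sum(conflicts(graph, coloring, v) for v in graph) // 2


def local_search(graph, k, p, max_moves, rng):
    """Conflict-random variable choice combined with a noisy min-conflicts value choice.

    With probability p a random colour is taken (possibly deteriorative);
    otherwise a colour minimising the conflicts of v is taken, ties broken at random.
    Returns a conflict-free colouring, or None if max_moves is exhausted.
    """
    coloring = {v: rng.randrange(k) for v in graph}
    for _ in range(max_moves):
        in_conflict = [v for v in graph if conflicts(graph, coloring, v) > 0]
        if not in_conflict:
            return coloring
        v = rng.choice(in_conflict)
        if rng.random() < p:
            coloring[v] = rng.randrange(k)
        else:
            scores = [conflicts(graph, coloring, v, c) for c in range(k)]
            best = min(scores)
            coloring[v] = rng.choice([c for c in range(k) if scores[c] == best])
    return None
```

A call such as `local_search(graph, 3, 0.1, 10000, random.Random(0))` on an
adjacency-list dictionary returns a colouring or `None`; the value of p would
have to be tuned empirically, as the text notes for the real strategies.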

If we trace the evolution of the evaluation function, i.e. the number of
unsatisfied constraints as a function of the number of iterations, we observe
that all four strategies rapidly reduce the number of unsatisfied constraints
within a relatively small number of iterations (descending phase). The
difference between these strategies appears during the following phase. For
var.2/val.3 and var.2/val.5, the search, after the descending phase, goes into
an oscillation phase composed of a long series of up-down moves, while for
var.1/val.2 and var.2/val.2, the search stagnates at plateaus.

In principle, any search strategy must reconcile two complementary aspects:
exploitation and exploration. Exploitation emphasizes careful examination of a
given area while exploration encourages the search to investigate new areas.
From this point of view, we can say that var.2/val.3 and var.2/val.5 manage to
exploit and explore the search space correctly by balancing randomness and
guidance. In contrast, due to the deterministic nature of the min-conflicts
heuristic, var.1/val.2 and var.2/val.2 focus only on the aspect of
exploitation.

3.3  Comparisons  with  Other  Methods 

This section lists the best results of the HLS procedures with var.2/val.3 and
var.2/val.5 for the COL and the FAP. The controlling probability p used by
var.2/val.3 varied between 0.1 and 0.2. The probability used by var.2/val.5 was
around 0.05. Whenever possible, the results are compared with those obtained by
other methods: Tabu search for the COL, and SA, constraint programming (CP) and
heuristic coloring algorithms (HCA) for the FAP.

Results are based on 1 to 10 independent runs according to the difficulty of
the instance. For each run, an HLS procedure begins with k colors/frequencies
(usually 10 colors/frequencies more than the best known value) and tries to
find a conflict-free solution for the underlying CSP. For each given number of
colors/frequencies, t tries (t = 1, 2 or 3) are authorized, i.e. if the search
cannot find a conflict-free solution within t tries, the current run is
terminated. If a conflict-free solution is found (within t tries), the number
of colors/frequencies is decreased and the search continues. The maximum number
of iterations (moves) is fixed at between 50,000 and 500,000 for each try in
each run.
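
The run protocol just described can be written as a small driver loop. The
sketch below is an assumption-laden illustration, not the authors' code:
`solve` is a hypothetical callback that performs one try for a given k (with
its own move limit) and returns a conflict-free assignment or `None`.

```python
def minimize_k(solve, k_start, tries):
    """Lower k for as long as `solve(k)` succeeds within `tries` attempts.

    `solve(k)` is a hypothetical callback: one try with k colours/frequencies,
    returning a conflict-free assignment or None when the try fails.
    Returns (best_k, best_solution); best_k is None if even k_start fails.
    """
    best_k, best_solution = None, None
    k = k_start
    while k >= 1:
        solution = None
        for _ in range(tries):
            solution = solve(k)
            if solution is not None:
                break
        if solution is None:
            break  # no success within the allowed tries: the run is terminated
        best_k, best_solution = k, solution
        k -= 1  # a solution was found, so try again with one value fewer
    return best_k, best_solution
```

The time reported for a run then naturally includes the intermediate CSPs and
the failed tries at the last value of k.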

Three criteria are used for reporting results: the minimum and the mean number
of colors/frequencies (NC/NF), and the mean time for finding a conflict-free
solution. The first two criteria reflect the quality of solutions while the
third reflects solving efficiency. In order to better assess the robustness of
HLS, we also give in brackets how many times a solution with the minimum number
of colors/frequencies has been found. The time (on a SPARCstation 10) is the
total user time for solving an instance, including the time for solving the
intermediate CSPs and the time for failed tries (up to 3 tries for each run).
Note that timing is given here only as a rough indication.
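
As a small worked illustration of these criteria, the following sketch derives
the "Min.(nb)" and "Ave." entries and the mean time from per-run results. The
function name and the sample numbers are invented for the example and are not
taken from the experiments.

```python
from statistics import mean


def summarize(values, times):
    """Summarise per-run results: minimum, how often it was reached, and means."""
    best = min(values)
    nb = values.count(best)
    return {
        "min": best,
        "nb": nb,
        "label": f"{best}({nb})",  # the "Min.(nb)" notation used in the tables
        "ave": mean(values),
        "time": mean(times),
    }


# Hypothetical run data: colours found and seconds used in five runs.
report = summarize([25, 25, 25, 25, 28], [30.0, 20.0, 25.0, 35.0, 40.0])
```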

Table 3 shows the results of the HLS procedures for some DIMACS benchmarks and
two classes (g100.5.col and g300.5.col) of 25 random graphs taken from [4]. The
instances are specified as follows. For random graphs gxxx.y.col and
Rxxx.y.col, xxx and y indicate respectively the number of vertices and the
density of the graph. For Leighton’s graphs lexxx_yy[a-d].col and structured
graphs flat300_yy_0.col, yy is the chromatic number. DSJC.1000.5.col(res) is
the residual graph (200 vertices and 9,633 edges) of DSJC.1000.5.col (1,000
vertices and about 500,000 edges). This residual graph was obtained by
eliminating 61 independent sets and all the related edges [4]. Table 3 also
shows the results of Tabu search (using a faster SuperSPARC 50 machine) [3].

The top part of Table 3 presents results for graphs having up to 300 vertices.
From the data in Table 3, we notice that for the 20 g100.5.col instances, HLS
finds the same result as Tabu search, while for the 5 g300.5.col instances it
needs on average 1.3 more colors. For the Rxxx.y.col family, HLS manages to
find a best known coloring for 5 out of 6 instances, but has difficulty
coloring R125.5.col, for which it requires 2 colors more than Tabu. For the
three flat300_yy_0.col instances, HLS obtains the same results as Tabu search
for 2 out of 3 instances. For flat300_26_0.col, HLS finds an optimal coloring
in one out of 5 runs. For DSJC.1000.5.col(res), HLS manages to color the graph
with 24 colors in about half an hour (instead of the best known value of 23
obtained with Tabu after about 20 hours of computing). With 23 colors, HLS
usually leaves 1 or 2 unsatisfied constraints at the end of its search.


Table 3. HLS performance for COL

                                 HLS                                   Tabu
Problems               Edges     Runs  NC Min.(nb)  NC Ave.  T[sec.]   Runs  NC Min.  NC Ave.  T[sec.]
g100.5.col (20 inst.)  ~2500     1     14(1)        14.95    275.3     1     -        14.95    ~9.5
g300.5.col (5 inst.)   ~22000    1     34(1)        34.80    4793      1     -        33.50    ~353
R125.1.col             209       10    5(10)        5.00     0.5       10    5        5.00     0
R125.1c.col            7501      10    46(10)       46.00    129       10    46       46.00    4.1
R125.5.col             3838      10    37(1)        38.00    187       5     35       35.60    1380
R250.1.col             867       10    8(10)        8.00     2         10    8        8.00     0
R250.1c.col            30227     10    64(8)        64.20    2946      10    64       64.00    108
R250.5.col             14849     3     69(1)        69.75    6763      5     69       69.00    1664
flat300_20_0.col       21375     5     20(5)        20.00    1997      10    20       20.00    40
flat300_26_0.col       21633     5     26(1)        31.40    6710      3     26       26.00    8100
flat300_28_0.col       21695     5     33(5)        33.00    2402      3     33       33.00    4080
DSJC.1000.5.col(res)   9633      5     24(5)        24.00    1531      5     23       23.00    ?
le450_15a.col          8168      5     16(5)        16.00    354       10    15       15.00    248
le450_15b.col          8169      5     16(5)        16.00    273       10    15       15.00    248
le450_15c.col          16680     5     15(1)        16.00    4376      10    16       16.00    268
le450_15d.col          16750     5     15(2)        15.60    3990      10    16       16.00    791
le450_25a.col          8260      5     25(5)        26.00    61        5     25       25.00    4.0
le450_25b.col          8263      5     25(4)        25.60    27        5     25       25.00    3.9

(? = entry illegible in the source)

The lower part of Table 3 presents the results for 6 Leighton graphs. We see
that HLS finds an optimal coloring for 4 out of 6 instances. For le450_15c.col
and le450_15d.col, HLS requires one less color than Tabu. For le450_15a.col and
le450_15b.col, the reverse is true. It should be mentioned that, in order to
color these graphs and more generally any graph having more than 300 vertices,
Tabu uses the technique of independent sets mentioned above to first produce a
much smaller residual graph, which is then colored.

Finally, we make a general remark about the HLS procedures used to obtain the
above results. The main goal of this work is to study the behavior of various
heuristics, not to improve on the best results for the coloring problem (in
fact, this second point is the subject of other ongoing work). Consequently,
the HLS procedures used are deliberately general and do not incorporate any
specialized technique for the COL. With this fact in mind, the results of the
HLS procedures may be considered very encouraging.

Table 4 shows the best results for 12 FAP instances obtained with HLS
procedures combined with a technique for handling co-station constraints [2].
These instances are taken from a series of 60 instances and represent some of
the hardest problems. Recall that each instance nf.tr.nc is specified by the
lower bound for the number of frequencies, the sum total of the traffic, and
the number of interference constraints. As we can see from the table, some
instances are very large and have a high density of constraints. Table 4 also
shows the best results of SA, CP (ILOG-Solver) and HCA, all reported in [15].
These procedures were run on HP stations, which are considered to be at least
three times faster than the SPARCstation 10 we used. A minus sign in the table
means that the result is not available.


Table 4. HLS performance for FAP

                HLS                                            HCA   CP             SA
Problems        Runs  NF Min.(nb)  NF Ave.  Eval.   T[sec.]    NF    NF   T[sec.]   NF   T[sec.]
8.150.2200      10    8(10)        8.00     812     639        8     8    ?         8    509
15.300.8940     10    15(10)       15.00    1326    1606       20    17   ?         15   4788
15.300.13400    10    15(10)       15.00    2557    3600       27    25   1560      15   2053
16.150.3203     10    16(1)        16.9     1070    658        19    18   14400     17   1744
16.150.3323     10    17(8)        17.2     3246    2283       19    19   7200      17   1383
30.300.13638    10    ?            30.00    18      25         30    30   1         30   1558
30.600.47852    2     30(1)        30.50    46666   122734     47    46   4800      36   5309
40.335.11058    2     43(2)        43.00    9434    17823      -     -    -         -    -
40.966.35104    10    ?            ?        707     1341       -     -    -         -    -
60.600.47688    10    ?            ?        4234    9288       60    -    -         60   858
60.600.45784    10    ?            ?        3981    ?          60    -    -         60   516

(? = entry illegible or missing in the source; where a row is short by one
value, the position of the gap is inferred from the column types)

In Table 4, we notice first that HCA gives the worst results for all the solved
instances; it requires up to 17 frequencies more than the lower bound. The
results of CP are a little better than those of HCA for 6 out of 7 problems. It
is interesting to see that instances which are difficult for HCA remain
difficult for CP. In contrast, the results of HLS and SA are much better than
those of HCA and CP: HLS (SA) finds an optimal assignment for 8 (6) instances.
In general, the harder the problem to be solved, the bigger the difference: for
15.300.13400 and 30.600.47852, there is a difference of more than 10
frequencies. Note finally that HLS improves on the result of SA for two
instances (in bold). In particular, for 30.600.47852, HLS finds a conflict-free
assignment with only 30 frequencies (36 for SA). This is rather surprising
given the similar results of these two approaches for the other instances. The
computing times of HLS and SA are generally similar for solutions of the same
quality. It is difficult to compare the computing time with that of CP since
the solutions obtained are too different.

To sum up, HLS gives the best result for the 12 hardest FAP instances. This
remains true for 48 other instances which have been solved but are not reported
here. However, we notice that for easier instances, all the methods behave
similarly. Moreover, CP and HCA may be faster for some easy instances.

4 Conclusions & Future Work

In this paper, we have presented a class of local search procedures based on
heuristics for choosing variables/values. The combinations of these heuristics
give different strategies for examining a very general neighborhood. These
heuristics can be applied to a wide range of problems.

Four representative combinations of these heuristics out of fifty possibilities
have been empirically evaluated and compared using the graph-coloring problem
and the frequency assignment problem. Two strategies turn out to be more
efficient: var.2 conflict-random/val.3 stochastic-best-one and var.2
conflict-random/val.5 probabilistic-improvement. In essence, these two
strategies are able to find a good balance between randomness and guidance,
which allows them to explore and exploit the search space correctly. The
controlling probability p used by these two strategies plays an important role
in their performance and should be determined empirically. In our experiments,
we have used values ranging from 0.1 to 0.2 for var.2/val.3 and values around
0.05 for var.2/val.5. More work is needed to better determine these values.
Moreover, the possibility of a self-adaptive p is also worth studying.

We also found that var.1 random/val.2 min-conflicts, and especially var.2
conflict-random/val.2 min-conflicts, are rather poor strategies for both COL
and FAP. In fact, these two strategies are too deterministic to be able to
escape from local optima. It is interesting to contrast this finding with the
work concerning the min-conflicts heuristic [13].

To further evaluate the performance of these heuristic LS procedures, they were
tested on more than 40 COL benchmarks and 12 hard FAP instances. Although they
are not especially tuned for the coloring problem, the HLS procedures give
results which are comparable with those of Tabu search for many instances. At
the same time, we noticed that HLS alone, like many other pure LS procedures,
has difficulty coloring very large and hard graphs. For the FAP, the results of
HLS on the tested instances are at least as good as those of simulated
annealing, and much better than those obtained with constraint programming and
heuristic coloring algorithms.

Currently, we are working on several related issues. At a practical level, we
first want to evaluate other combinations not covered in this paper. Second, we
are developing specialized coloring algorithms combining the general heuristics
of this paper and well-known coloring techniques. Indeed, in order to color
hard graphs, all efficient algorithms use specialized techniques. It will be
very interesting to see if the combination of our heuristics with these kinds
of techniques can lead to better results.

At a more fundamental level, we are trying to identify the characteristics of
problems which may be efficiently exploited by a given heuristic. This is based
on the belief that a heuristic has a certain capacity to exploit special
structures or characteristics of a problem. Thus, the heuristic may have some
“favorite” problems. A long-term goal of the work is to look for a better
understanding of the behavior of LS heuristics for solving problems and to
answer such fundamental questions as when and why a heuristic works or does
not.


Abstract. This paper presents a defeasible constraint solver for the domain of
linear equations, disequations and inequalities over the field of rational/real
numbers. As extra requirements resulting from the incorporation of the solver
into an Incremental Hierarchical Constraint Solver (IHCS) scenario we
identified: a) the ability to refer to individual constraints by a label, b)
the ability to report the (minimal) cause for the unsatisfiability of a set of
constraints, and c) the ability to undo the effects of a formerly activated
constraint.

We develop the new functionalities after starting the presentation with a
general architecture for defeasible constraint solving, through a solved form
algorithm that utilizes a generalized, incremental variant of the Simplex
algorithm, where the domain of a variable can be restricted to an arbitrary
interval. We demonstrate how generalized slacks form the basis for the
computation of explanations regarding the cause of unsatisfiability and/or
entailment in terms of the constraints told, and the possible deactivation of
constraints as demanded by the hierarchy handler.


Keywords:  Constraint  Logic  Programming,  Linear  Programming,  Defeasible 
Constraint  Solving 


1  Introduction 

Although Constraint Logic Programming enhances the limited expressive power and
execution efficiency of Logic Programming, it is insufficient to cope with
problems for which many solutions might satisfy a set of mandatory (or hard)
constraints of the problem, but where some solutions are preferred to others.
In this case, the user should somehow select from the set of all solutions
found by a CLP program those that (s)he prefers, and this is not practical when
the solution set is large.

An alternative approach is to use an overconstrained specification, including
both hard constraints and soft constraints (that merely specify preferences),
and have a system compute the solutions that satisfy in the best possible way a
subset of these preference constraints. This was the approach taken by [2, 16],
which proposed an HCLP scheme that allows non-required (or soft) constraints to
be specified with some preference level and relies on a constraint solver that
explores this hierarchy of constraints to detect the best solutions.

Although the scheme is quite general, few details were published on its
implementation. In [13, 14] we presented IHCS, an efficient and incremental
defeasible constraint solver that is used as the kernel of an HCLP instance for
finite domains. The key points of our implementation were a) early detection of
failures through the use of the usual node- and arc-consistency techniques for
these domains; b) detection of conflict sets, i.e. the sets of constraints
responsible for the failures (this is done by keeping dependencies between
constraints through shared variables by adaptation of the AC-5 algorithm
[12]); c) selection from these conflict sets of constraints that should be
relaxed, together with the selection of constraints (currently relaxed because
of conflicts with the former) that can now be safely reactivated; and d)
defeating the constraints, i.e. removing the effects of relaxed constraints
while avoiding reevaluation from scratch.

Although our scheme could in principle be applied to any other domain, the
early implementation of IHCS did not separate the constraint solver
(responsible for the detection of failures and their causes) from the hierarchy
manager (responsible for choosing, given some preference criterion, which
constraints to relax and which to reactivate).

Moreover,  the  constraint  solver  detected  unsatisfiable  sets  of  constraints  by 
means  of  constraint  propagation  on  a  constraint  network,  and  the  method  is  not 
applicable  to  domains  that  do  not  use  this  representation  of  constraints.  This  is 
the  case  with  (linear)  constraint  solvers  over  the  reals/rationals  which  rely  on 
algebraic  methods  (e.g.  some  variant  of  the  Simplex  algorithm).  To  effectively 
apply  our  scheme  to  other  domains  it  was  thus  necessary  a)  to  clearly  separate 
the  constraint  solver  component  from  the  hierarchy  handler,  and  b)  to  enhance 
constraint  solvers  of  these  domains  to  cope  with  the  new  demands  of  defeasibility. 

These  requirements  are  twofold.  On  the  one  hand,  the  constraint  solver  must 
be  able  to  explain  the  cause  of  unsatisfiability  and/or  entailment  in  terms  of  the 
constraints  told.  On  the  other  hand,  it  must  be  able  to  cope  with  the  incremental 
activation  and  deactivation  of  constraints  as  demanded  by  the  hierarchy  handler. 

In this paper we propose a solution to these extensions for CLP(Q), a linear
constraint solver over the field of rational numbers. Interestingly, both
extensions can be realized with the single, simple idea of generalized slack
variables.

The following sections describe a general architecture for defeasible
constraint solving, recapitulate the workings of the traditional Simplex
algorithm, introduce a variant through the generalization of slack variables,
cover the identification of minimal conflict sets, and derive defeasibility.

2 An Architecture for Defeasible Constraint Solving

In [13, 14] an Incremental Hierarchical Constraint Solver (IHCS(X, ⪯)) is
presented as a general framework to handle, incrementally, hierarchies of
constraints in some domain X using some comparator ⪯. In these papers, only the
instantiation of X to Finite Domains is addressed, and there is no clear
division between the component that handles the constraint hierarchies, the
comparators used to rank solutions, and the constraint solvers for specific
domains.

This section presents the new architecture of IHCS that takes such a separation
into account, making it truly general and able to include different constraint
solvers (and thus different domains). Figure 1 depicts this general
architecture and the interface among the separate components.


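[Figure 1: block diagram of the Hierarchy Manager, CS(X) and the Comparators
Library, with the following interfaces.

Interface offered by IHCS (top):
    insert(+constraint@level, -label)
    changeLevel(+label, +newLevel)
    remove(+label)
    displayConstraint(+label)

Interface between the Hierarchy Manager and CS(X):
    new(+constraint, +label)
    activate(+label, -CS)
    deactivate(+label)
    remove(+label)
    display(+label)

Interface between the Hierarchy Manager and the Comparators Library:
    next(+⪯, +Φ, +CS, -Φnext)]


Fig. 1. The Architecture of IHCS(X, ⪯)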


The functionality IHCS offers to any system requiring defeasible constraint
handling (in our case, IHCS is embedded in a Prolog-like engine to yield an
HCLP(X, ⪯) language) is displayed at the top of the figure. This interface
highlights the defeasible nature of IHCS, by including primitives to add or
remove a constraint and to promote or demote some existing constraint (by
changing its strength or hierarchical level).

The Hierarchy Manager (HM) is responsible for demanding the activation and
relaxation of constraints, so as to maintain the best solution. Since it has no
specific knowledge of the domain theory X, it must rely on some specialized
constraint solver CS(X) to check satisfiability of constraints on domain X. The
insertion of a new constraint into CS(X) is made with predicate new(constraint,
label), which simply creates the constraint with a given label but does not
activate it. Constraint labels are required for future reference to the
corresponding constraints. This is the case for their activation, via predicate
activate(label, CS), where CS (the conflict set) returns the set of constraints
responsible for the possible unsatisfiability of the active constraints (CS is
of course empty if these constraints are satisfiable). The reason why
constraints are created and activated with different interface entries is that
IHCS may require several activations or deactivations (via predicate
deactivate(label)) of a certain constraint during the search for optimal
solutions. A deactivated constraint may be removed from CS(X) with predicate
remove(label).

A library of comparators includes a set of procedures to compute the next best
configuration according to a variety of criteria. Given the sets of constraints
currently active and relaxed (the current configuration Φ), and a conflict set
CS (returned by the CS(X) component), predicate next(⪯, Φ, CS, Φnext) computes
the next configuration Φnext to be tried according to comparator ⪯.

To summarize, and to comply with IHCS requirements, a CS(X) must be:

1. Incremental - upon the activation of a constraint (demanded through
   predicate activate(label, CS)), the constraint solver must check the
   satisfiability of the active set of constraints together with the new
   constraint;

2. Explanatory - once unsatisfiability is detected, its causes should be
   reported to the HM (as a conflict set CS);

3. Defeasible - upon the deactivation of a constraint (demanded through
   predicate deactivate(label)), the effects of this formerly activated
   constraint should be removed while avoiding reevaluation from scratch.

Of course, all the above requirements presuppose that the labels used by the
hierarchy manager to refer to individual constraints are shared by the
constraint solver.

These requirements are met by our finite domain constraint solver described in
[14]. The rest of this paper explains the changes made to a constraint solver
for linear constraints over rational/real numbers, namely its explanatory and
defeasibility enhancements.
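
To make the three requirements concrete, here is a deliberately tiny sketch of
a CS(X) in Python. It is not the solver of this paper: the class name, the
tuple encoding of constraints and the `interval` helper are assumptions made
for the illustration, the domain is restricted to bounds on a single variable,
and the conflict set is returned as a value rather than through an output
argument. The method names follow the interface entries new, activate,
deactivate and remove.

```python
class BoundSolver:
    """Toy CS(X): labelled constraints of the form X <= v or X >= v on one variable."""

    def __init__(self):
        self.constraints = {}  # label -> (op, value) with op in {"<=", ">="}
        self.active = set()    # labels of the currently activated constraints

    def new(self, constraint, label):
        """Create a constraint under `label` without activating it."""
        self.constraints[label] = constraint

    def _tightest(self, op):
        """Return (value, label) of the tightest active bound of kind `op`, or None."""
        pick = min if op == "<=" else max
        candidates = [(self.constraints[l][1], l) for l in self.active
                      if self.constraints[l][0] == op]
        return pick(candidates) if candidates else None

    def activate(self, label):
        """Incremental and explanatory: activate, then return the conflict set."""
        self.active.add(label)
        upper, lower = self._tightest("<="), self._tightest(">=")
        if upper and lower and lower[0] > upper[0]:
            return {lower[1], upper[1]}  # the two bounds that clash
        return set()

    def deactivate(self, label):
        """Defeasible: dropping the label undoes its effect, nothing is recomputed."""
        self.active.discard(label)

    def remove(self, label):
        """Forget a constraint altogether (it is deactivated first if needed)."""
        self.deactivate(label)
        self.constraints.pop(label, None)

    def interval(self):
        """Current (lower, upper) bounds; None stands for an absent bound."""
        upper, lower = self._tightest("<="), self._tightest(">=")
        return (lower[0] if lower else None, upper[0] if upper else None)
```

In this sketch a constraint stays active after a conflict is reported; it is
left to the caller, playing the role of the hierarchy manager, to deactivate
one member of the returned conflict set.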


3  Simplex  with  Generalized  Slack  Variables 

Classical Simplex [5] deals with a single sort of slack variables: s_i ≥ 0. Free
variables, negative variables and strict inequalities create minor problems,
some of which are addressed by using pairs of slacks.

We will now generalize the concept of slack variables by allowing for arbitrary
intervals as the domains of variables. This covers of course the classical
Simplex slacks with a non-strict lower bound of zero.

Example 1.

Input constraints    Equations with classical slacks
X ≤ 10               X + S1 = 10
X ≤ 8                X + S2 = 8
X ≥ 2                -X + S3 = -2

Instead of introducing three slack variables with their corresponding rows in
the Simplex tableau, the bounds are represented directly as attributes of the
affected variable.

Example 2.

Input constraints    Representation with generalized slacks
X ≤ 10
X ≤ 8                X[2,8]
X ≥ 2

The next section deals with the interaction of generalized slacks with higher
dimensional constraints.

3.1  Solved  Form  for  Inequalities  over  Bounded  Variables 

The idea itself was realized a long time ago in the area of linear programming
under the name of bounded variable linear programs [15]. In bounded variable
linear programs, some or all variables are restricted to lie within individual
lower and upper bounds. Such problems can of course be solved by including all
bound restrictions as constraints, i.e. rows in the simplex tableau. The
advantage of keeping them out of the tableau is that the size of the working
basis is smaller. Trivial non-satisfiability, redundancy and implicit
equalities are detected by trivial tests of O(1) complexity. This approach
matches and advances current activities in the CLP area that try to restrict
the use of general decision methods to the cases where they are unavoidable
[10].

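We formalize bounded variable linear programs as:

\[
\begin{array}{lll}
\text{Minimize}   &       & cx \\
\text{subject to} & (1.a) & Ax = b \\
                  & (1.b) & l_j \le x_j \le u_j \quad \text{for } j \in J \\
                  & (1.c) & x_i \text{ unrestricted} \quad \text{for } i \notin J
\end{array}
\tag{1}
\]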

where Ax = b denotes the subset of the constraints {c_i | dim(c_i) > 1}, and
inequalities have been transformed into equations through the introduction of
generalized slack variables. A feasible solution x of (1) is a Basic Feasible
Solution (BFS) iff the set

\[
\{A_j : j \in J,\ l_j < x_j < u_j\} \cup \{A_j : j \notin J\} \tag{2}
\]

where A_j is the j-th column vector of A, is linearly independent. A working
basis for (1) is a square, nonsingular submatrix of A of order m. Variables
associated with column vectors of the working basis will be called basic
variables. All other variables will be called non-basic variables. It is clear
that a feasible solution x is an extreme point in the solution space iff there
exists a corresponding working basis with the property that:

1.  all  non-basic  variables  are  either  at  their  lower  or  upper  bound 

2.  the  basic  variables  are  within  their  bounds 

The  working  basis,  together  with  conditions  1  and  2  from  above,  constitutes  our 
proposed  solved  form  for  linear  inequalities  over  bounded  variables. 

Example 3.

Input constraints         Solved form
x1 + x2 + 2 x3 ≤ 4,       x1[0,2] = 1 + (1/2) x2 + (1/2) s2 - s1    } basis
3 x2 + 4 x3 ≤ 6,          x3[0,_] = 3/2 - (3/4) x2 - (1/4) s2       }
0 ≤ x1, x1 ≤ 2,           s1[\overline{0},_]
0 ≤ x2, x2 ≤ 9,           s2[\overline{0},_]
0 ≤ x3                    x2[\overline{0},9]

Notational conventions: $x1_{[0,2]}$ means that x1 has a (non-strict) lower bound of
zero and a (non-strict) upper bound of two. An unspecified bound is denoted as
in $x3_{[0,\_]}$, where we have no finite upper bound. The active bound of non-basic
variables is denoted by overlining as in $x2_{[\overline{0},9]}$. If you insert the values for the
active bounds into the right hand sides (rhs) of the equations defining the basic
variables x1, x3, you will find that the resulting values for x1, x3 are within the
respective bounds. Note that only the two higher dimensional inequalities led to
the introduction of slack variables s1, s2.
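The two solved-form conditions can be checked mechanically for Example 3. The following is a minimal sketch, not the authors' Prolog implementation; the dictionary layout and helper names are illustrative assumptions, and the slack bounds s1, s2 >= 0 are assumed from the slack definitions x1 + x2 + 2 x3 + s1 = 4 and 3 x2 + 4 x3 + s2 = 6.

```python
from fractions import Fraction as F

# Bounds as (lower, upper); None stands for "no finite bound".
# Assumption: both slacks are non-negative.
bounds = {"x1": (F(0), F(2)), "x2": (F(0), F(9)), "x3": (F(0), None),
          "s1": (F(0), None), "s2": (F(0), None)}

# Basic rows: variable -> (constant, {non-basic variable: coefficient}).
rows = {
    "x1": (F(1), {"x2": F(1, 2), "s2": F(1, 2), "s1": F(-1)}),
    "x3": (F(3, 2), {"x2": F(-3, 4), "s2": F(-1, 4)}),
}

# Condition 1: every non-basic variable sits at one of its bounds
# (here all of them at the lower bound 0).
active = {"x2": F(0), "s1": F(0), "s2": F(0)}


def value(row):
    """Evaluate a basic row at the active bounds of the non-basic variables."""
    const, coeffs = row
    return const + sum(c * active[v] for v, c in coeffs.items())


def within(var, val):
    """Condition 2: is the value inside the bounds of the variable?"""
    lo, hi = bounds[var]
    return (lo is None or lo <= val) and (hi is None or val <= hi)


basic_values = {v: value(r) for v, r in rows.items()}
solved = all(within(v, val) for v, val in basic_values.items())

# Sanity check against the two higher dimensional input constraints.
vals = {**active, **basic_values}
residuals = (vals["x1"] + vals["x2"] + 2 * vals["x3"] + vals["s1"] - 4,
             3 * vals["x2"] + 4 * vals["x3"] + vals["s2"] - 6)
```

Exact rationals are used because the text reports rationals as the base numeric type; floating point would work analogously.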

Algorithmic Details: Finding the solved form of a bounded variable linear
program can be rephrased as a search problem, where we have:

1.  A  given  initial  state,  consisting  of  a  system  where  the  solved  form  invariants 
may  be  violated, 

2.  The  specification  of  a  solution  state  through  the  solved  form  invariants. 

3.  The  operators: 

(a) pivot(x_i, x_j)

(b) toggle_active_bound(x_i) for non-basic variables

The non-determinism in the selection of the operators and their arguments
can be removed by the same rules that are employed in the original Simplex
algorithm:

- In order to enter the basis, the type of the non-basic variable must be
compatible with the sign of the coefficient of the variable
in the objective function

- The leaving variable corresponds to the row in the working basis that imposes
the tightest constraint on the entering variable

- The active bound of a variable may be toggled if the tightest constraint
imposed on the variable through the working basis accommodates the change

A violation of the solved form is always detected by locating a basic variable
that is out of its bounds. As the solved form will be computed incrementally,
there will always be at most one such row and it will correspond to the (m+1)-st
source constraint.

Theorem 1.

1. The incremental solved form algorithm constitutes a decision algorithm for
the satisfiability problem of a polyhedral set

2. The incremental solved form algorithm detects implicitly fixed values

Proof. The solved form obviously satisfies

\[ \forall k \in 1 \ldots m: \quad l_k \le \inf(x_k) \le \phi(x_k) \le \sup(x_k) \le u_k \qquad (3) \]

To establish the corresponding relation for the challenging row i = m + 1, we
interpret the linear combination of non-basic variables that defines the basic
variable as artificial objective function.

\[ x_i = \sum_{j \mid x_j \notin \text{basis}} a_{ij} x_j \qquad (4) \]

The evaluation of $x_i$ with respect to the BFS may give rise to a repair
action consisting in the iterated application of the operators pivot and
toggle_active_bound in order to decrement (increment) $\phi(x_i)$ until

1. $l_i \le \phi(x_i) \le u_i$: solved form established. Satisfiable.

2. None of the operators is applicable, optimality, thus

\[ \phi(x_i) = \inf(x_i) \quad \text{or} \quad \phi(x_i) = \sup(x_i) \qquad (5) \]

(a) $\phi(x_i) = \inf(x_i) > u_i$: unsatisfiable

(b) $\phi(x_i) = \inf(x_i) = u_i \Rightarrow x_i = u_i$: fixed value

(c) $\phi(x_i) = \sup(x_i) < l_i$: unsatisfiable

(d) $\phi(x_i) = \sup(x_i) = l_i \Rightarrow x_i = l_i$: fixed value

□

Practical Details: The incremental solved form for bounded variable linear
programs forms the basis for the implementation of the CLP(Q) and CLP(R)
systems distributed with the SICStus and Eclipse Prolog. The coverage is at
least as complete as that of earlier CLP(R) implementations: The system
incrementally solves linear equations over rational or real valued variables, covers the
lazy treatment of nonlinear equations, features a decision algorithm for linear
inequalities that detects fixed values, removes redundancies, performs projections
(quantifier elimination), allows for linear dis-equations, and provides for linear
optimization.

It  is  coded  in  Prolog  using  Attributed  Variables  [8]  which  serve  as  direct  access 
storage  locations  for  properties  associated  with  variables.  At  the  same  time, 
attributed  variables  make  the  unification  part  of  a  unification  based  language, 
Prolog in our particular case, user-definable within the language under extension
[7, 9].

Empirics: In the following table we list the execution time ratio b/s between
the solved form algorithm using bounded variables and a 'crippled' version which
works like the original Simplex with one slack and a row for each inequality. The
first two examples are from [4], computing the first and all solutions to the
geometric covering problem where nine squares of unknown and different sizes are
required to fill an unspecified rectangle. The remaining ones are executions of a very
simple minded branch and bound (BB) code on top of our solved form for some
of the smaller examples from the MIPLIB mixed integer linear programming
examples. Branch and bound is expected to benefit from generalized slacks
because BB strengthens the original problem relaxation with simple inequalities
like $x \le \lfloor \phi(x) \rfloor$ and $x \ge \lceil \phi(x) \rceil$ when branching.


example          b/s

9 squares 1st    0.78
9 squares all    0.76
flugpl           0.48
stein15          0.68
sample2          0.70
bm23             0.73
egout            0.11
enigma           0.59
mod013           0.42
pipex            0.57
sentoy           0.77
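The ratios in the table can be summarized in a few lines. This is only an illustrative sketch; the choice of the geometric mean as the summary statistic is an assumption, not something the text prescribes.

```python
from statistics import geometric_mean, mean

# Execution time ratios b/s copied from the table above.
ratios = {
    "9 squares 1st": 0.78, "9 squares all": 0.76, "flugpl": 0.48,
    "stein15": 0.68, "sample2": 0.70, "bm23": 0.73, "egout": 0.11,
    "enigma": 0.59, "mod013": 0.42, "pipex": 0.57, "sentoy": 0.77,
}

avg = mean(ratios.values())            # arithmetic mean of the ratios
geo = geometric_mean(ratios.values())  # less sensitive to which side is the baseline
speedup = 1 / geo                      # summary speedup of b over s
best = min(ratios, key=ratios.get)     # example with the largest gain
```

The arithmetic mean of the listed ratios is about 0.60; the geometric mean is lower because of the egout outlier.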

On  this  collection,  the  solver  with  generalized  slacks  is  roughly  twice  as  fast, 
everything  else  held  constant:  same  solver  data  structures,  same  base  numeric 
(rationals), same machine. Basically the same ratios are obtained when computing
with floating point numbers.

4 Determining Minimal Conflict Sets

In this section we will see how generalized slacks can be used to determine the
reasons for the unsatisfiability of a set of constraints. We distinguish between
equations, inequalities and disequalities, i.e. constraints of the form

\[ \sum_{j=1}^{n} a_{ij} x_j \bowtie b_i \quad \text{where} \quad \bowtie\ \in \{=, \ne, <, \le, \ge, >\} \qquad (6) \]

Interface:  In  the  sequel  we  need  the  ability  to  refer  to  individual  constraints 
by  a  symbolic  name  which  we  call  a  label.  Figure  1  depicts  the  interface  between 
the  CLP(Q)  solver  kernel  and  the  hierarchy  manager  part  of  IHCS(X, The 
activation  operation  supplies  the  solver  with  the  label  of  the  constraint  to  be 
activated  and  the  solver  returns  a  conflict  set  (CS)  where  the  label  is  an  abstract 
data  type  not  looked  at  by  the  solver,  and  the  CS  is  a  union  of  labels,  possibly 
empty. 


4.1  Failure  Analysis  for  Equations 

We  solve  systems  of  equations  by  Gaussian  elimination.  At  any  time,  the  set  of 
variables  is  partitioned  into  basic  and  non-basic  variables.  The  basic  variables 
are  expressed  in  terms  of  the  non-basics.  Upon  the  addition  of  a  constraint  it  is 
dereferenced  against  the  solved  form.  That  is,  references  to  the  basic  variables  are 
replaced by their definitions. If the resulting expression is 0 = 0, the constraint
is entailed. The constraint is in conflict with the solved form if $0 = k$, $k \ne 0$.
Otherwise,  we  solve  for  an  arbitrary  variable  in  the  dereferenced  expression  and 
add  this  definition  to  the  solved  form. 

We extend this scheme through the addition of a unique slack variable with
bounds [0, 0] to each equation. The basis for the validity of this operation is that
one may substitute zero at any time for all such variables without changing the
original problem statement. We call this special sort of slack variables witness
variables after [6] where the very same trick was applied for a completely different
purpose. The initial coefficients for the witness variables are immaterial, but 1 is
convenient. As the solved form is manipulated, the coefficients change, and the
witness  variables  track  the  dependencies  between  the  constraints  which  originate 
from  dereferencing  and  from  pivot  operations. 


Example 4.

\[
\begin{array}{l|l}
\text{input constraint(s)} & \text{solved form} \\ \hline
a + b = 10 & a = 10 - b - w_1 \\ \hline
a + b = 10 & a = 5 - \tfrac{1}{2} w_1 + \tfrac{1}{2} w_2 \\
a = b & b = 5 - \tfrac{1}{2} w_1 - \tfrac{1}{2} w_2
\end{array}
\qquad (7)
\]

The two equations determine the values for a and b, as can be seen by substituting
zero for $w_i$. Adding a third, incompatible constraint a = 4 dereferences into
$-1 = -\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2 + w_3$. That is, $-1 = 0$, and the culprits are identified by the
witness variables $w_1, w_2, w_3$: removing any of the corresponding equations restores
satisfiability.
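The scheme above can be sketched in a few lines. This is a hypothetical illustration, not the CLP(Q) implementation: the function name `add_equation`, the tuple encoding `("w", label)` for witness variables, and the choice of the first free variable as pivot are all assumptions. The second equation is entered as b - a = 0 so that the witness signs agree with (7).

```python
from fractions import Fraction as F


def add_equation(solved, coeffs, rhs, label):
    """Add sum(coeffs[v] * v) + w_label = rhs to the solved form.

    solved maps each basic variable to (constant, {non-basic var: coeff}).
    Returns None if the equation was added or is entailed; otherwise the
    labels of the witness variables occurring in the contradiction 0 = k.
    """
    # Dereference: build const + expr = 0, replacing basic variables.
    const, expr = -F(rhs), {("w", label): F(1)}
    for var, c in coeffs.items():
        if var in solved:
            k, row = solved[var]
            const += c * k
            for v, rc in row.items():
                expr[v] = expr.get(v, F(0)) + c * rc
        else:
            expr[var] = expr.get(var, F(0)) + F(c)
    expr = {v: c for v, c in expr.items() if c != 0}
    free = [v for v in expr if not isinstance(v, tuple)]
    if not free:  # only witness variables remain: 0 = 0 or 0 = k
        return None if const == 0 else {v[1] for v in expr}
    # Solve for an arbitrary variable and substitute into existing rows.
    pivot = free[0]
    p = expr.pop(pivot)
    new_const, new_row = -const / p, {v: -c / p for v, c in expr.items()}
    for var in list(solved):
        k, row = solved[var]
        if pivot in row:
            c = row.pop(pivot)
            for v, rc in new_row.items():
                row[v] = row.get(v, F(0)) + c * rc
            solved[var] = (k + c * new_const,
                           {v: x for v, x in row.items() if x != 0})
    solved[pivot] = (new_const, new_row)
    return None


solved = {}
r1 = add_equation(solved, {"a": 1, "b": 1}, 10, 1)   # a + b = 10
r2 = add_equation(solved, {"b": 1, "a": -1}, 0, 2)   # a = b
form_after_two = {v: (k, dict(row)) for v, (k, row) in solved.items()}
conflict = add_equation(solved, {"a": 1}, 4, 3)      # a = 4, incompatible
```

Since every equation carries its own witness, the returned conflict set always includes the label of the constraint being added.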

Proposition 2. Witness variables with nonzero coefficients in a dereferenced
equation identify the original constraints which are responsible for the entailment
or unsatisfiability of the equation.

4.2  Failure  Analysis  for  Disequalities 

By the use of a unique slack variable we turn each

\[ \sum_{j=1}^{n} a_{ij} x_j \ne b_i \quad \text{into} \quad \sum_{j=1}^{n} a_{ij} x_j - b_i = s_{nz} \qquad (8) \]

where $s_{nz}$ may assume any value but zero. The resulting equation is dealt with as
outlined in the previous section. In the implementation, an obvious optimization
is to combine the slack $s_{nz}$ with the witness variable for the equation.

4.3  Failure  Analysis  for  Inequalities 

After the solved form algorithm fails to increase or decrease the row that violates the
solved form, as described in section (3.1), the following holds:

Proposition 3. The basic variables with non-zero coefficients in this row
identify the constraints that are in conflict with the constraint the row represents
itself.

This is because the termination condition of the solved form algorithm, i.e. the
non-applicability of the operators pivot and toggle_active_bound, is, like in the
original Simplex algorithm, based on the signs, and in our case types, of the
variables in the objective function. Upon termination, the value of the objective
function is known and a corresponding set of non-basic variables is identified. A
stronger result is:

Theorem 4. Once the solved form algorithm detects unsatisfiability in (5), the
constraints identified by the non-basic variables with nonzero coefficients
constitute a minimal inconsistent subset of the set of constraints.

We draw upon a result by deBacker [1], which extended Lassez's Quasi-Dual
results [11] based on Fourier's theorem.

Theorem 5 (Fourier 1827). A set S of inequalities is inconsistent iff there exists
a positive linear combination of inequalities which gives $0 \le k$, where $k < 0$.

Theorem 6 (Lassez 1990). For S as defined above its quasi dual is:

\[ Q: \quad \sum_{i} \lambda_i a_{ij} = 0, \quad \sum_{i} \lambda_i = 1, \quad \lambda_i \ge 0 \qquad (9) \]

- If Q is empty then S is solvable

- Otherwise let

\[ M = \min \sum_{i} \lambda_i b_i \qquad (10) \]

  • If $M \ge 0$, S is solvable

  • If $M < 0$, S is unsolvable

Theorem 7 (deBacker 1991). When S is unsolvable, we have a witness vertex
of Q corresponding to M. A subset of S given by the indices $\lambda_i \ne 0$ is a minimal
inconsistent subset of S.


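Proof of theorem 4. We exhibit the correspondence between the dual problem
that arises from the repair action in the solved form algorithm for an unsatisfiable
constraint and the quasi dual formulation for the whole system of constraints.
The dual problem [15] for an optimization problem in standard form

\[ \text{maximize } \sum_{j=1}^{n} c_j x_j \quad \text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \le b_i, \ i \in 1 \ldots m \qquad (11) \]

is

\[ \text{minimize } \sum_{i=1}^{m} \lambda_i b_i \quad \text{subject to} \quad \forall i \in 1 \ldots m, \ \lambda_i \ge 0 \qquad (12) \]

The quasi dual for the total system including the (m+1)-th row is

\[ Q_{total}: \quad \sum_{i=1}^{m+1} \lambda_i a_{ij} = 0, \quad \sum_{i=1}^{m+1} \lambda_i = 1, \quad \forall i \in 1 \ldots m+1, \ \lambda_i \ge 0 \qquad (13) \]

The dual and the quasi dual are related by:

\[ \forall j \in 1 \ldots n: \quad \sum_{i=1}^{m} \lambda_i a_{ij} + \lambda_{m+1} c_j = 0 \qquad (14) \]

Thus, except for $\sum_i \lambda_i = 1$, the dual and the quasi dual correspond. The presence
of this sum in the quasi dual is just a technical trick to force a unique solution
to the minimization problem in (10). Therefore, without it, and because in (12)
the non-basic $\lambda$'s are zero at the optimum, the $\lambda$'s correspond under the scaling:

\[ \lambda'_{m+1} = 1 \quad \text{and} \quad \lambda'_i = \lambda_i / \lambda_{m+1} \qquad (15) \]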

which yields of course the same incidence relation regarding minimal inconsistent
subsets of S. □

Example 5. The first six of the following constraints are satisfiable. The addition
of the seventh results in unsatisfiability.

source constraint              label

$x + 3y + 2z \ge 5$,           1
$2x + 2y + z \ge 2$,           2
$4x - 2y + 3z \ge -1$,         3
$x \ge 0$,                     4
$y \ge 0$,                     5
$z \ge 0$,                     6
$6x + 5y + 2z \le 4$           7

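Put into standard form, the quasi dual reads:

\[
\begin{pmatrix}
-1 & -2 & -4 & -1 & 0 & 0 & 6 \\
-3 & -2 & 2 & 0 & -1 & 0 & 5 \\
-2 & -1 & -3 & 0 & 0 & -1 & 2
\end{pmatrix} \lambda = 0, \qquad \sum_{i=1}^{7} \lambda_i = 1, \qquad \lambda_i \ge 0
\]

\[ \text{minimize } (-5\lambda_1 - 2\lambda_2 + \lambda_3 + 4\lambda_7) \]

gives: $\lambda = (\tfrac{1}{9}, 0, 0, \tfrac{5}{9}, \tfrac{2}{9}, 0, \tfrac{1}{9})$

Taking $\tfrac{1}{9} sc_1 + \tfrac{5}{9} sc_4 + \tfrac{2}{9} sc_5 + \tfrac{1}{9} sc_7$ results in $0 \le -\tfrac{1}{9}$, where $sc_i$ is the i-th source
constraint.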
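The quasi-dual solution for this example can be verified by direct arithmetic. The sketch below only checks the stated vertex (it does not solve the minimization), and the four feasible points used to illustrate minimality are hand-picked assumptions, not taken from the text.

```python
from fractions import Fraction as F

# Normalized constraints a_i . (x, y, z) <= b_i for sc1 .. sc7.
A = [(-1, -3, -2), (-2, -2, -1), (-4, 2, -3),
     (-1, 0, 0), (0, -1, 0), (0, 0, -1), (6, 5, 2)]
b = [-5, -2, 1, 0, 0, 0, 4]
lam = [F(1, 9), F(0), F(0), F(5, 9), F(2, 9), F(0), F(1, 9)]

# Membership in Q: the weighted columns cancel and the weights sum to one.
combo = [sum(l * row[j] for l, row in zip(lam, A)) for j in range(3)]
total = sum(lam)
M = sum(l * bi for l, bi in zip(lam, b))   # objective value of the quasi dual

# Conflict set: labels with nonzero multiplier.
conflict = [i + 1 for i, l in enumerate(lam) if l != 0]


def satisfies(i, point):
    """Does the point satisfy normalized source constraint i (1-based)?"""
    return sum(a * v for a, v in zip(A[i - 1], point)) <= b[i - 1]


# Minimality: dropping any one member leaves the other three satisfiable.
# Each point is a hand-picked witness for the subset without label `dropped`.
witness_points = {1: (0, 0, 0), 4: (-10, 0, 10), 5: (0, -10, 20), 7: (0, 0, 3)}
minimal = all(satisfies(k, p)
              for dropped, p in witness_points.items()
              for k in conflict if k != dropped)
```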

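In our solved form we have the following: after the addition of $sc_7$, the solved form
is violated at the corresponding slack variable $s7_{[\_,0]}$, where the upper bound is
0 but $\phi(s7) = \tfrac{47}{13}$.

\[
\begin{array}{lrcl}
(*) & s7_{[\_,0]} &=& \tfrac{47}{13} + \tfrac{75}{13} x - \tfrac{19}{13} s1 + \tfrac{4}{13} s3 \\
 & s2_{[\_,0]} &=& -\tfrac{15}{13} - \tfrac{22}{13} x + \tfrac{8}{13} s1 - \tfrac{1}{13} s3 \\
 & y_{[0,\_]} &=& \tfrac{17}{13} + \tfrac{5}{13} x - \tfrac{3}{13} s1 + \tfrac{2}{13} s3 \\
 & z_{[0,\_]} &=& \tfrac{7}{13} - \tfrac{14}{13} x - \tfrac{2}{13} s1 - \tfrac{3}{13} s3 \\[1ex]
 & \multicolumn{3}{l}{x_{[\overline{0},\_]}, \quad s1_{[\_,\overline{0}]}, \quad s3_{[\_,\overline{0}]}}
\end{array}
\qquad (18)
\]

The application of pivot(y, s3) reduces $\phi(s7)$ to 1, but the solved form is still
violated. None of the variables x, y, s1 can enter the basis, thus $\phi(s7) = 1 = \inf(s7)$.

\[
\begin{array}{lrcl}
(*) & s7_{[\_,0]} &=& 1 + 5x + 2y - s1 \\
 & s2_{[\_,0]} &=& -\tfrac{1}{2} - \tfrac{3}{2} x - \tfrac{1}{2} y + \tfrac{1}{2} s1 \\
 & s3_{[\_,0]} &=& -\tfrac{17}{2} - \tfrac{5}{2} x + \tfrac{13}{2} y + \tfrac{3}{2} s1 \\
 & z_{[0,\_]} &=& \tfrac{5}{2} - \tfrac{1}{2} x - \tfrac{3}{2} y - \tfrac{1}{2} s1 \\[1ex]
 & \multicolumn{3}{l}{x_{[\overline{0},\_]}, \quad y_{[\overline{0},\_]}, \quad s1_{[\_,\overline{0}]}}
\end{array}
\qquad (19)
\]

Reading off the coefficients from (*), we get: $\lambda' = (-1, 0, 0, 5, 2, 0, 1)$. Note that
the first component of this vector is negative because we assumed normalized
inequalities, i.e. every inequality expressed as $\sum_j a_{ij} x_j \le b_i$, in the proof only.
Application gives:

\[
\begin{array}{rcl}
-x - 3y - 2z \le -5 & & -x - 3y - 2z \le -5 \\
5x \ge 0 & \xrightarrow{\text{normalize}} & -5x \le 0 \\
2y \ge 0 & & -2y \le 0 \\
6x + 5y + 2z \le 4 & & 6x + 5y + 2z \le 4
\end{array}
\qquad (20)
\]

whose sum is $0 \le -1$.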
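The pivot step and the final unsatisfiability test of this example can be replayed with exact rationals. This is a sketch under stated assumptions: the slack convention s_i = b_i - a_i x with upper bound 0 for the ">=" constraints (s7 = 6x + 5y + 2z - 4 for the seventh), the dictionary layout, and the helper names `pivot` and `infimum` are all illustrative and not taken from the CLP(Q) sources.

```python
from fractions import Fraction as F


def pivot(rows, leaving, entering):
    """Exchange basic variable `leaving` with non-basic variable `entering`."""
    const, coeffs = rows.pop(leaving)
    coeffs = dict(coeffs)
    p = coeffs.pop(entering)
    # Solve the leaving row for the entering variable.
    new_const = -const / p
    new_coeffs = {v: -c / p for v, c in coeffs.items()}
    new_coeffs[leaving] = 1 / p
    # Substitute the new definition into all remaining rows.
    for var in list(rows):
        k, row = rows[var]
        c = row.get(entering, F(0))
        merged = {v: row.get(v, F(0)) + c * new_coeffs.get(v, F(0))
                  for v in (set(row) | set(new_coeffs)) - {entering}}
        rows[var] = (k + c * new_const,
                     {v: x for v, x in merged.items() if x != 0})
    rows[entering] = (new_const, new_coeffs)


def infimum(row, bounds):
    """Smallest value of a row over the bounds of its non-basic variables."""
    const, coeffs = row
    total = const
    for v, c in coeffs.items():
        lo, hi = bounds[v]
        bound = lo if c > 0 else hi   # the bound that minimizes c * v
        if bound is None:
            return None               # unbounded below
        total += c * bound
    return total


# Rows before the pivot: basic variables s7, s2, y, z over x, s1, s3.
rows = {
    "s7": (F(47, 13), {"x": F(75, 13), "s1": F(-19, 13), "s3": F(4, 13)}),
    "s2": (F(-15, 13), {"x": F(-22, 13), "s1": F(8, 13), "s3": F(-1, 13)}),
    "y": (F(17, 13), {"x": F(5, 13), "s1": F(-3, 13), "s3": F(2, 13)}),
    "z": (F(7, 13), {"x": F(-14, 13), "s1": F(-2, 13), "s3": F(-3, 13)}),
}
pivot(rows, "y", "s3")

bounds = {"x": (F(0), None), "y": (F(0), None), "s1": (None, F(0))}
label = {"x": 4, "y": 5, "s1": 1}     # source constraint behind each variable
inf_s7 = infimum(rows["s7"], bounds)
unsat = inf_s7 is not None and inf_s7 > 0     # the upper bound of s7 is 0
conflict = {7} | {label[v] for v in rows["s7"][1]}
```

The conflict set is read off the violated row alone, which is why no explicit basis inverse is needed for this computation.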

5  Deactivation  of  Constraints 

Again we organize this section after the classes of constraints we deal with. The
basis for the correctness of the operations performed is the invertibility of linear
transformations. Space not permitting for more details, we only mention a)
that entailed constraints have to be reactivated if no longer entailed because
of the deactivation of an entailing constraint, and b) that deactivation has of
course to deal with the transitive closure of consequences of fixed value
detection/propagation.



5.1  Deactivating  Equations 

An  equation  is  deactivated  by  solving  for  the  corresponding  witness  variable. 
Between  the  introduction  of  the  witness  variable  and  the  time  we  are  about  to 
relax  the  equation,  the  solved  form  was  changed  by  linear  transformations  only, 
which  is  the  guarantee  that  we  can  solve  for  the  witness  variable.  Solving  for  a 
variable  removes  it  from  all  the  right  hand  sides  of  all  basic  variables  where  it 
occurs,  and  then  we  may  abandon  the  row  for  the  witness  variable.  Consider  our 
example  again: 


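Example 6.

\[
\begin{array}{l|l}
\text{input constraint(s)} & \text{solved form} \\ \hline
a + b = 10 & a = 10 - b - w_1 \\ \hline
a + b = 10 & a = 5 - \tfrac{1}{2} w_1 + \tfrac{1}{2} w_2 \\
a = b & b = 5 - \tfrac{1}{2} w_1 - \tfrac{1}{2} w_2
\end{array}
\qquad (21)
\]

To deactivate the second equation a = b, we solve for $w_2 = -10 + 2a + w_1$,
substitute and drop the row for $w_2$: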


Example 7.

\[
\begin{array}{l|l}
\text{input constraint(s)} & \text{solved form} \\ \hline
a + b = 10 & b = 10 - a - w_1
\end{array}
\qquad (22)
\]

Which is equivalent to the solved form for a + b = 10. If we deactivate the first
equation instead, we have $w_1 = 10 - 2a + w_2$ and:

\[
\begin{array}{l|l}
\text{input constraint(s)} & \text{solved form} \\ \hline
a = b & b = a - w_2
\end{array}
\qquad (23)
\]

5.2 Deactivating Disequations

Recall  that  disequations  are  turned  into  equations  via  slacks.  Deactivating  a 
disequation  is  by  solving  for  the  witness  variable  for  the  corresponding  equation, 
and  dropping  the  row  afterwards. 
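Deactivation by solving for a witness variable, as used for equations and disequations, can be sketched as follows. The function name `deactivate`, the dictionary layout, and the choice of the first row containing the witness are illustrative assumptions.

```python
from fractions import Fraction as F


def deactivate(solved, witness):
    """Solve a row for `witness`, substitute it everywhere, drop its row.

    solved maps basic variables to (constant, {non-basic var: coeff}).
    Returns the definition (constant, row) that was derived for the witness.
    """
    # Pick some row in which the witness variable occurs.
    var = next(v for v, (_, row) in solved.items() if witness in row)
    const, row = solved.pop(var)
    p = row.pop(witness)
    # witness = (var - const - remaining terms) / p
    w_const = -const / p
    w_row = {v: -c / p for v, c in row.items()}
    w_row[var] = 1 / p
    # Remove the witness from all other right hand sides.
    for other in list(solved):
        k, r = solved[other]
        c = r.pop(witness, F(0))
        merged = {v: r.get(v, F(0)) + c * w_row.get(v, F(0))
                  for v in set(r) | set(w_row)}
        solved[other] = (k + c * w_const,
                         {v: x for v, x in merged.items() if x != 0})
    return w_const, w_row      # the caller abandons this row


def example():
    """Solved form for a + b = 10 (witness w1) and a = b (witness w2)."""
    return {"a": (F(5), {"w1": F(-1, 2), "w2": F(1, 2)}),
            "b": (F(5), {"w1": F(-1, 2), "w2": F(-1, 2)})}


after_w2 = example()
w2_def = deactivate(after_w2, "w2")    # relax a = b
after_w1 = example()
w1_def = deactivate(after_w1, "w1")    # relax a + b = 10 instead
```

In this sketch relaxing a = b leaves b = 10 - a - w1, and relaxing a + b = 10 leaves b = a - w2.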


5.3  Deactivating  Inequations 

An  inequation  is  deactivated  by  bringing  the  associated  slack  variable  into  the 
basis  and  by  removing  the  corresponding  bound.  A  variable  is  brought  into  the 
basis  by  pivoting  it  with  the  most  constraining  row  in  the  basis.  If  there  is  no 
constraining  row  for  a  non-basic  variable,  we  may  simply  drop  the  bound  to  be 
deactivated. 

6  Conclusion 

It turned out to be remarkably simple to meet the explanatory and defeasibility
requirements of the IHCS scenario for the instantiation of the constraint solver
component to CLP(Q). One concept, generalized slacks, provides both mechanisms.
With regard to computational complexity, explanations are for free if we
have  to  deal  with  inequalities  free  of  fixed  values  only.  If  there  are  (implicitly) 
fixed  values  and/or  additional  equations,  the  extra  cost  for  carrying  along  the 
witness  variables  is  rewarded  by  the  possibilities  a)  to  deactivate  the  constraints 
later,  and  b)  although  not  elaborated  on  here,  to  have  backtracking  without 
trailing  in  the  constraint  solver  [6].  Our  work  shares  objectives  with  [3].  Our 
improvement  is  in  the  addition  of  defeasibility  to  the  thread  and  that  we  don’t 
need  an  explicit  inverse  of  the  basis  for  CS  computations. 

With respect to applications we naturally envision classical dependency
directed backtracking applications, but expect more rewarding results from the
extra expressiveness and flexibility through the IHCS architecture.

Abstract. The constraint programming community has recently begun
to address certain types of optimization problems. These problems tend
to be discrete or to have discrete elements. Although sensitivity analysis
is well developed for continuous problems, progress in this area for
discrete problems has been limited. This paper proposes a general approach
to sensitivity analysis that applies to both continuous and discrete
problems. In the continuous case, particularly in linear programming,
sensitivity analysis can be obtained by solving a dual problem. One way to
broaden this result is to generalize the classical idea of a dual to that of
an “inference dual,” which can be defined for any optimization problem.
To solve the inference dual is to obtain a proof of the optimal value of
the problem. Sensitivity analysis can be interpreted as an analysis of the
role of each constraint in this proof. This paper shows that traditional
sensitivity analysis for linear programming is a special case of this
approach. It also illustrates how the approach can work out in a discrete
problem by applying it to 0-1 linear programming (linear pseudo-boolean
optimization).


1  Introduction 

Sensitivity analysis addresses the issue of how much the solution of an optimization
problem responds to perturbations in the problem data. It is an indispensable
element of applied modeling, perhaps as important as obtaining the solution
itself.  It  is  needed  not  only  to  anticipate  the  effect  of  changes  in  the  problem,  but 
to  deal  with  the  fact  that  in  applied  work,  obtaining  the  information  necessary 
to  formulate  an  accurate  model  is  often  the  hardest  part  of  the  task.  Sensitivity 
analysis  typically  reveals  that  the  solution  depends  primarily  on  a  few  key  data, 
whereas  the  rest  of  the  problem  can  be  altered  somewhat  without  appreciable 
effect.  This  allows  one  to  focus  time  and  resources  on  collecting  and  verifying  the 
information  that  really  matters.  More  generally,  it  directs  the  decision  maker’s 
attention  to  those  aspects  of  the  problem  that  should  be  closely  watched. 

By  far  the  most  widely  used  optimization  tool  is  linear  programming,  for 
which  sensitivity  analysis  is  highly  developed.  One  of  the  basic  results  is  that  the 
solution of the linear programming dual indicates the sensitivity of the optimal
value to perturbations in the right-hand sides of the inequality constraints. These
results  have  been  extended  to  certain  discrete  optimization  problems.  There  are 
duality  theories  for  integer  programming,  for  instance,  that  can  serve  as  a  basis 
for  sensitivity  analysis;  a  brief  survey  may  be  found  in  Section  23.7  of  [13]. 
Integer  dual  solutions  become  very  complex  as  the  problem  grows,  however,  and 
are  rarely  used  in  practice.  These  and  related  approaches  (e.g.,  [12,  14])  are 
based  on  an  investigation  of  how  the  optimal  value  depends  on  the  right-hand 
sides  of  inequality  and  equality  constraints,  and  it  is  unclear  how  the  ideas  would 
generalize  to  problems  with  other  types  of  constraints. 

The  approach  taken  here  is  to  define  the  inference  dual  of  an  optimization 
problem  and  use  it  as  the  basis  for  sensitivity  analysis.  The  inference  dual  is  the 
problem  of  inferring  from  the  constraints  a  best  possible  bound  on  the  optimal 
value.  The  solution  of  the  dual  is  a  proof,  using  an  inference  method  that  is 
appropriate  to  the  problem.  Sensitivity  analysis  can  be  viewed  as  an  analysis  of 
the  role  of  each  constraint  in  this  proof.  For  example,  a  constraint  may  not  even 
appear  as  a  premise  in  the  proof,  in  which  case  it  is  redundant,  or  if  it  does, 
the  proof  may  yet  go  through  if  the  constraint  is  weakened  by  a  determinable 
amount.  This  type  of  analysis  can  in  principle  be  carried  out  for  any  type  of 
constraint. 

There may be several proofs of an optimal bound, and if so sensitivity analysis
differs somewhat in each. This phenomenon is known as “degeneracy” in
classical mathematical programming, where it tends to be regarded as a technical
nuisance. Here it is seen to be a natural outcome of the fact that more than
one  rationale  can  be  given  for  an  optimal  solution. 

To  solve  the  inference  dual,  one  must 

a) identify inference rules that are complete for the type of constraints in the
problem;

b) use the rules to prove optimality.

In  the  linear  programming  dual,  for  example,  one  infers  inequalities  from  other 
inequalities.  The  inference  rule  is  simple;  all  inequalities  implied  by  a  constraint 
set  can  be  obtained  by  taking  nonnegative  linear  combinations  of  the  constraints. 
This is essentially the content of the classical “separation lemma” for linear
programming, which is therefore a completeness theorem for linear inference. To
find the particular linear combination that solves the dual, one can use information
obtained in solving the original problem (the “primal”). The multipliers in
this  combination  indicate  the  sensitivity  of  the  optimal  value  to  perturbations 
in  the  right-hand  sides  of  the  corresponding  constraints. 

The question here is how to address points (a) and (b) for discrete optimization
problems. One answer lies in the idea of deriving a dual solution from
information gathered while solving the primal problem. A discrete problem can
be solved enumeratively by building a search tree that branches on values of the
variables. The structure of this tree reflects the structure of a proof of optimality,
in the following way. First, a constraint is added to require the objective function
value to be less than the true minimum, so that the tree now establishes
infeasibility of the augmented constraint set. At each leaf node of the tree, a
certain type of proposition (a “multivalent clause”) that is violated at that node
is inferred from one of the constraints. The tree now indicates the structure of a
resolution proof that the multivalent clauses are unsatisfiable, using a “multivalent”
resolution method that generalizes classical resolution. The inference rules
required  in  (a)  are  therefore  those  of  multivalent  resolution,  plus  those  needed  to 
infer  multivalent  clauses  from  constraints.  The  proof  required  in  (b)  is  given  by 
the  structure  of  the  search  tree.  Sensitivity  analysis  consists  generally  in  noting 
the  role  played  by  each  constraint  in  the  proof,  and  specifically  in  checking  how 
much  the  constraints  can  be  changed  so  that  they  still  imply  the  multivalent 
clauses  that  are  used  as  premises  in  the  proof. 

Inference  duality  also  permits  a  generalization  of  Benders  decomposition  to 
any  optimization  problem.  This  technique  allows  one  to  generate  “Benders  cuts” 
that  are  analogous  to  nogoods  but  that  exploit  problem  structure  in  a  way  that 
nogoods  do  not.  This  idea  is  developed  in  [7].  Other  connections  between  logical 
inference  and  optimization  are  surveyed  in  [4,  6]. 

The  paper  begins  with  an  elementary  treatment  of  the  inference  dual  and 
its  role  in  linear  programming  sensitivity  analysis.  It  then  describes  multivalent 
resolution  and  shows  how  an  optimality  proof  of  this  kind  can  be  recovered 
from  an  enumeration  tree  that  solves  the  primal  problem.  It  concludes  with  an 
application  to  linear  0-1  programming  and  some  practical  observations. 

2  The  Inference  Dual 

Consider a general optimization problem,

\[
\begin{array}{ll}
\min & f(x) \\
\text{s.t.} & x \in S \\
 & x \in D.
\end{array}
\qquad (1)
\]

The domain D is distinguished from the feasible set S. In most applications x is
a vector $(x_1, \ldots, x_n)$, in which case $D_{x_j}$ denotes the domain of $x_j$.

To state the inference dual it is necessary to define the notion of implication
with respect to a domain D. Let P and Q be two propositions about x; that
is, their truth or falsehood is determined by the value of x. P implies Q with
respect to D (notated $P \xrightarrow{D} Q$) if Q is true for any $x \in D$ for which P is true.
The inference dual of (1) is

\[
\begin{array}{ll}
\max & z \\
\text{s.t.} & x \in S \xrightarrow{D} f(x) \ge z
\end{array}
\qquad (2)
\]

So the dual seeks the largest z for which $f(x) \ge z$ can be inferred from the
constraint set. A strong duality theorem is true almost by definition. To state
it, it is convenient to say that the optimal value of a minimization problem is
respectively $\infty$ or $-\infty$ when the problem is infeasible or unbounded, and
vice-versa for a maximization problem.

Theorem 1 (Strong Inference Duality). The optimization problem (1) has the
same optimal value as its inference dual (2).

Proof. If $z^*$ is the optimal value of (1), then clearly $x \in S$ implies $f(x) \ge z^*$,
which shows that the optimal value of the dual is at least $z^*$. The dual cannot
have an optimal value larger than $z^*$, because this would mean that $f(x) = z^*$
cannot be achieved in (1) for any feasible x. If (1) is infeasible, then any z is
feasible in (2), which therefore has optimal value $\infty$. If (1) is unbounded, then
(2) is infeasible with optimal value $-\infty$. □
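For a small finite domain the inference dual can be evaluated by brute force, which makes the strong duality statement concrete. The instance below (domain, feasible set, objective) is a made-up illustration, and restricting z to integers is an assumption that is harmless here because the objective is integer valued.

```python
from itertools import product

# Hypothetical instance: x = (x1, x2) with each component in {0, 1, 2}.
D = list(product(range(3), repeat=2))
S = [x for x in D if x[0] + x[1] >= 2]      # feasible set


def f(x):
    """Objective function of the illustrative problem."""
    return 3 * x[0] + 2 * x[1]


# Primal: minimize f over the feasible points.
primal = min(f(x) for x in S)


def implied(z):
    """Does 'x in S' imply 'f(x) >= z' with respect to the domain D?"""
    return all(f(x) >= z for x in D if x in S)


# Dual: the largest z for which the implication holds.
candidates = range(min(map(f, D)), max(map(f, D)) + 1)
dual = max(z for z in candidates if implied(z))
```

Enumeration is of course exponential in general; the point of the later sections is to recover such a proof from a search tree instead.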

Because  strong  duality  is  a  trivial  affair  for  inference  duality,  interesting 
duality  theorems  must  deal  with  some  other  aspect.  A  natural  task  for  a  duality 
theorem  is  to  provide  a  complete  method  for  deriving  inferences  in  the  dual,  as 
explained  earlier. 

A  dual  solution  provides  sensitivity  analysis  because  it  specifies  the  role  played 
by  each  constraint  in  a  deduction  of  the  optimal  value.  This  is  illustrated  below 
in  the  cases  of  linear  and  0-1  programming. 

3  Linear  Programming  Duality  and  Sensitivity  Analysis 

A linear programming problem

\[
\begin{array}{ll}
\min & cx \\
\text{s.t.} & Ax \ge a \\
 & x \ge 0,
\end{array}
\qquad (3)
\]

where matrix A is m x n, has the following inference dual.

\[
\begin{array}{ll}
\max & z \\
\text{s.t.} & (Ax \ge a,\ x \ge 0) \rightarrow cx \ge z.
\end{array}
\qquad (4)
\]

The dual looks for a linear implication $cx \ge z$ of the constraints that maximizes
z. Linear implication is characterized by the following, which is equivalent to a
classical separation lemma for linear programming.

Theorem 2 (Linear implication). $Ax \ge a$ linearly implies $cx \ge z$ (i.e., $Ax \ge a \rightarrow cx \ge z$)
if and only if $Ax \ge a$ is infeasible or there is a real vector $u \ge 0$
for which $uA \le c$ and $ua \ge z$.

This means that the dual (4) seeks a nonnegative linear combination $uAx \ge ua$
of $Ax \ge a$ that dominates $cx \ge z$ (i.e., $uA \le c$ and $ua \ge z$) and that maximizes
z. So the dual can be written in the classical way,

\[
\begin{array}{ll}
\max & ua \\
\text{s.t.} & uA \le c \\
 & u \ge 0
\end{array}
\qquad (5)
\]
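How a multiplier vector u acts as a proof of a bound can be seen on a tiny instance of (3) and (5). The numbers below are a made-up example, and the candidate vectors u and x are supplied by hand rather than computed by a solver.

```python
from fractions import Fraction as F

# Hypothetical instance: min 2 x1 + 3 x2
# subject to x1 + x2 >= 4, x1 + 3 x2 >= 6, x >= 0.
A = [[1, 1], [1, 3]]
a = [4, 6]
c = [2, 3]
u = [F(3, 2), F(1, 2)]     # candidate dual multipliers, one per constraint
x = [3, 1]                 # candidate primal solution

# The combination uAx >= ua dominates cx >= z when uA <= c and u >= 0.
uA = [sum(u[i] * A[i][j] for i in range(2)) for j in range(2)]
ua = sum(u[i] * a[i] for i in range(2))
cx = sum(c[j] * x[j] for j in range(2))

dual_feasible = all(ui >= 0 for ui in u) and all(uA[j] <= c[j] for j in range(2))
primal_feasible = all(xj >= 0 for xj in x) and all(
    sum(A[i][j] * x[j] for j in range(2)) >= a[i] for i in range(2))

# ua is a proven lower bound on cx; equality certifies optimality of both.
optimal = dual_feasible and primal_feasible and ua == cx
```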

The vector $u \in \mathbb{R}^m$ in effect encodes a proof, because it gives instructions for
deducing the optimal value z. The duality theorem for linear programming follows
immediately from Theorem 2.

Corollary 3. The optimal value of a linear programming problem (3) is the same
as  that  of  its  classical  dual  (5),  except  when  both  are  infeasible. 

Note  that  unlike  inference  duality,  the  classical  theorem  requires  a  regularity 
condition  to  the  effect  that  either  a  problem  or  its  dual  must  be  feasible. 

The optimal dual solution $u^*$ provides sensitivity analysis because it
indicates the role of each constraint in a proof of optimality. For instance,
if $u_i^* = 0$, then constraint $i$ has no role in the proof, and the
constraint can be omitted without invalidating the proof and therefore without
changing the optimal value.

More generally, one can reason as follows. If the vector $a$ of right-hand
sides is changed to $a + \Delta a$, then $u^*$ remains a feasible solution of
the resulting dual problem (5) with value $u^*(a + \Delta a)$. So the optimal
dual value is at least $u^*(a + \Delta a)$, and by Corollary 3 the same is true
of the optimal value of the primal problem.

This means that if constraint $i$ is strengthened by raising its right-hand
side $a_i$ by $\Delta a_i \ge 0$, the optimal value will rise at least
$u_i^* \Delta a_i$. (In particular, it will rise to $\infty$ if the problem
becomes infeasible.) If $a_i$ is reduced by $\Delta a_i$, the optimal value can
fall no more than $u_i^* \Delta a_i$. The increase or decrease is exactly
$u_i^* \Delta a_i$ if $\Delta a_i$ lies within easily computable bounds.
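The bound can be written out in one line. Here $z^*(a)$ denotes the optimal value as a function of the right-hand side; this notation is introduced only for this illustration. Assuming the original problem has a finite optimum, dual feasibility of $u^*$ for the perturbed problem together with Corollary 3 gives:

```latex
\[
z^*(a + \Delta a) \;\ge\; u^*(a + \Delta a)
  \;=\; u^* a + u^* \Delta a
  \;=\; z^*(a) + \sum_i u_i^* \, \Delta a_i .
\]
```

Raising a single $a_i$ by $\Delta a_i \ge 0$ therefore raises the optimal value by at least $u_i^* \Delta a_i$, and lowering it by $\Delta a_i$ lowers the optimal value by at most $u_i^* \Delta a_i$.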

The  dual  problem  can  have  several  optimal  extreme  point  solutions  (i.e., 
optimal  solutions  that  are  not  convex  combinations  of  each  other).  This  can 
occur  when  the  primal  solution  is  “degenerate.”  Each  solution  gives  rise  to  a 
different  sensitivity  analysis. 

A  more  complete  exposition  of  linear  programming  sensitivity  analysis  may 
be  found  in  [2]. 

4  A  Multivalent  Resolution  Method 

It  is  well  known  that  the  resolution  method  originally  developed  by  Quine  [9,  10] 
(and  later  extended  to  first  order  logic  by  Robinson  [11])  provides  a  complete 
refutation  method  for  propositional  logic  in  conjunctive  normal  form.^ 

The resolution method is readily generalized to problems in which the variable
domains contain more than two discrete values. The method that results is
related to Cooper’s algorithm for achieving $k$-consistency [3], but it is
convenient here to cast it as an inference method. Consider a set $S$ of
multivalent clauses of the form,

\[ (x_1 \in T_{i1}) \vee \cdots \vee (x_n \in T_{in}), \]

where each $x_j \in T_{ij}$ is a literal, and each $T_{ij} \subset D_{x_j}$. If
$T_{ij} = \emptyset$, then the literal $x_j \in T_{ij}$ can be omitted from the
disjunction, and it is convenient to say that the clause does not contain
$x_j$. The resolvent on $x_j$ of the clauses in $S$ is

\[ \Bigl( x_j \in \bigcap_i T_{ij} \Bigr) \vee \bigvee_{k \ne j} \Bigl( x_k \in \bigcup_i T_{ik} \Bigr). \]

^ Quine’s method, sometimes called consensus, was actually for formulas in
disjunctive normal form, but it is easily dualized to treat conjunctive normal
form.

For example, the first three clauses below resolve on $x_1$ to produce the
fourth. Here each $x_j$ has domain $\{1, 2, 3, 4\}$.

\[
\begin{array}{l}
(x_1 \in \{1,4\}) \vee (x_2 \in \{1\}) \\
(x_1 \in \{2,4\}) \vee (x_2 \in \{2,3\}) \\
(x_1 \in \{3,4\}) \vee (x_2 \in \{1\}) \\
\hline
(x_1 \in \{4\}) \vee (x_2 \in \{1,2,3\})
\end{array}
\]

To check a multivalent clause set $S$ for satisfiability, identify a subset of
clauses whose resolvent does not already belong to $S$. If there is no such
subset, stop and conclude that $S$ is satisfiable. If the resolvent is the
empty clause (i.e., each $T_j = \emptyset$), stop and conclude that $S$ is
unsatisfiable. Otherwise add the resolvent to $S$ and repeat. The proof that
multivalent resolution is a sound and complete refutation method is parallel to
that for ordinary resolution. A weaker result (Theorem 4, below) will suffice
for present purposes.
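The resolvent is easy to compute directly from its definition: intersect the value sets of the variable resolved on, and take unions for every other variable. The following is a minimal sketch; the dictionary representation of a clause and the name `resolvent` are choices made for this example, not notation from the paper.

```python
def resolvent(clauses, j):
    """Resolve multivalent clauses on variable j.

    Each clause is a dict mapping a variable name to the set T of values
    its literal allows; a missing variable stands for an empty T.
    Literals with an empty value set are dropped from the result.
    """
    variables = set().union(*(clause.keys() for clause in clauses))
    result = {}
    for k in variables:
        sets = [frozenset(clause.get(k, ())) for clause in clauses]
        if k == j:
            values = frozenset.intersection(*sets)   # variable resolved on
        else:
            values = frozenset().union(*sets)        # all other variables
        if values:
            result[k] = values
    return result


# The three clauses of the example above, resolved on x1.
clauses = [
    {"x1": {1, 4}, "x2": {1}},
    {"x1": {2, 4}, "x2": {2, 3}},
    {"x1": {3, 4}, "x2": {1}},
]
print(resolvent(clauses, "x1"))
```

For the example clauses this yields the literal $x_1 \in \{4\}$ together with $x_2 \in \{1, 2, 3\}$, matching the fourth clause.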


5  Solving  the  Dual  via  the  Primal 

It  is  possible  in  general  to  characterize  a  primal  method  for  solving  a  problem 
as one that examines possible values of the variables, and a dual method as one
that  examines  possible  proofs  of  optimality  without  interpreting  the  variables.  In 
classical  optimization,  a  branch-and-bound  method  is  a  primal  method,  whereas 
a  pure  cutting  plane  method  is  essentially  a  dual  method  (see  [8]  for  background). 
Primal  methods  have  generally  proved  more  effective  for  optimization,  although 
primal  and  dual  methods  are  often  combined,  as  in  branch-and-cut  and  dual 
ascent  algorithms. 

Fortunately,  the  inference  dual  of  a  discrete  optimization  problem  can  be 
solved  by  examining  the  results  of  a  primal  method,  in  particular  an  enumeration 
tree  that  is  generated  to  solve  the  primal  problem.  In  fact,  this  approach  yields 
both  a  complete  refutation  method  for  the  type  of  inference  used  in  the  dual 
and  an  algorithm  for  constructing  a  proof  of  optimality.  It  will  be  seen  that  a 
refutation method, as opposed to a full inference method, suffices for sensitivity
analysis. 

The  most  straightforward  way  to  solve  (1)  by  enumeration  is  to  branch  on 
values  of  the  variables  until  all  feasible  solutions  have  been  found.  The  search 
backtracks  whenever  a  feasible  solution  is  found  or  some  constraint  is  violated. 
The  best  feasible  solution  is  optimal.  A  generic  algorithm  for  generating  the 
search  tree  appears  in  Fig.  1.  The  development  below  is  readily  modified  to 
accommodate  search  algorithms  that  use  bounding  and  other  devices  for  pruning 
the  tree. 

Let $z^*$ be the optimal value found by enumeration. To solve the dual, first
modify the enumeration tree so that it refutes the claim that $f(x) < z^*$.
This is done simply by adding the constraint $f(x) < z_t$ to the constraint set
$\mathcal{C}$ for each node $t$ at which a feasible solution is found, where
$z_t$ is the value of that solution. Then $f(x) < z_t$ can be regarded as the
constraint violated at node $t$. There is now at least one violated constraint
at every leaf node.

Let C be the set of constraints that define the feasible set S in (1).
Let L be a list of active nodes, initially containing the root node,
    which is associated with an empty set of assignments (labels).
Let z be an upper bound on the optimal value; initially z = ∞.
While L is nonempty:
    Remove a node from L, and let A be the set of assignments
        associated with the node.
    If the assignments in A violate no constraint in C then
        If A assigns values to every variable so as
            to satisfy every constraint in C (or the assignments in A
            can be readily extended to all variables so as to satisfy
            every constraint) then
            Let z = min{z, f(v_1, ..., v_n)}, where (v_1, ..., v_n) is
                the satisfying assignment.
        Else
            Choose a variable x_j that A does not assign a value.
            For each v ∈ D_{x_j}:
                Add a node to L and associate it with A ∪ {x_j = v}.
If z < ∞ then z is the optimal value of (1).
Else (1) is infeasible.


Fig. 1. A generic enumeration algorithm for solving the primal problem.
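The algorithm of Fig. 1 translates almost line for line into code. The sketch below is one possible rendering under stated assumptions: constraints are given as functions that return `False` only when a partial assignment already violates them, the helper `linear_ge` (a name invented here) builds such a function for a linear 0-1 inequality, and the test data are the 0-1 example (8) from Section 6, with the leading term of constraint (c) taken to be $x_1$.

```python
import math


def linear_ge(coeffs, rhs):
    """Checker for sum(coeffs[v] * x[v]) >= rhs over 0-1 variables.

    The returned function gives False only when the partial assignment
    already rules the inequality out, i.e. when even the largest
    attainable left-hand side is below rhs.
    """
    def not_violated(assignment):
        best = sum(c * assignment[v] if v in assignment else max(0, c)
                   for v, c in coeffs.items())
        return best >= rhs
    return not_violated


def enumerate_optimum(domains, constraints, objective):
    """Generic enumeration as in Fig. 1; returns math.inf if infeasible."""
    z = math.inf                     # upper bound on the optimal value
    active = [{}]                    # list L; the root has no assignments
    while active:
        assignment = active.pop()
        if all(ok(assignment) for ok in constraints):
            if len(assignment) == len(domains):
                z = min(z, objective(assignment))
            else:
                xj = next(v for v in domains if v not in assignment)
                for value in domains[xj]:
                    active.append({**assignment, xj: value})
    return z


domains = {"x1": (0, 1), "x2": (0, 1), "x3": (0, 1)}
constraints = [
    linear_ge({"x1": 2, "x2": 5, "x3": -1}, 3),    # (a)
    linear_ge({"x1": -1, "x2": 1, "x3": 4}, 4),    # (b)
    linear_ge({"x1": 1, "x2": 1, "x3": 1}, 2),     # (c), first term assumed
]
cost = lambda x: 7 * x["x1"] + 5 * x["x2"] + 3 * x["x3"]
print(enumerate_optimum(domains, constraints, cost))
```

The search order here is depth-first because nodes are popped from the end of the list; Fig. 1 leaves the order open, and any order gives the same optimal value.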


For any leaf node $t$ let $(x_{j_1}, \ldots, x_{j_d}) = (v_1, \ldots, v_d)$ be
the assignments made along the path from node $t$ to the root. As just noted,
these assignments violate some $C_t \in \mathcal{C}$. Because $C_t$ is
equivalent to the conjunction of all multivalent clauses it implies,
$(x_{j_1}, \ldots, x_{j_d}) = (v_1, \ldots, v_d)$ violates some multivalent
clause $M_t$ implied by $C_t$. Without loss of generality $M_t$ can be assumed
to contain only variables in $\{x_{j_1}, \ldots, x_{j_d}\}$. The enumeration
tree is now a refutation of $\bigwedge_t M_t$. Due to the following theorem,
the tree’s structure indicates how to construct a resolution proof of
$\neg \bigwedge_t M_t$.

Theorem 4. Consider an enumeration tree in which a multivalent clause $M_t$ is
associated with each node $t$. Let the clause associated with any nonleaf node
be the resolvent on $x_j$ of those associated with the node’s children, where
$x_j$ is the variable on which the tree branches at that node. Then if $M_t$ is
falsified at each leaf node $t$, the clause associated with the root node is
empty.

Proof. It suffices to show that the clause associated with the root node is
falsified, because only the empty clause is falsified when no variables are
fixed. This can be proved by induction. It is given that $M_t$ is falsified at
any leaf node $t$. Now consider any nonleaf node $k$, and let nodes
$1, \ldots, p$ be its children. By the induction hypothesis, $M_1, \ldots, M_p$
are falsified. Suppose that the search branches on variable $x_j$ at node $k$
and that $x_j = v_s$ is the assignment that generates child node $s$. Then
$M_s$ cannot contain a literal $x_j \in T_{sj}$ with $v_s \in T_{sj}$. If it
did, $M_s$ would be true at node $s$. Because the children cover every value in
the domain of $x_j$, it follows that $\bigcap_{s=1}^{p} T_{sj} = \emptyset$,
which means that the resolvent $M_k$ of $\{M_1, \ldots, M_p\}$ does not contain
$x_j$. The remaining literals of $M_k$ are unions of literals that are already
falsified by the assignments made above node $k$. So $M_k$ is falsified at node
$k$. It follows by induction that the root clause is falsified. □

A refutation of $f(x) < z^*$ can therefore be constructed from two ingredients:
multivalent resolution, and an algorithm for checking whether a constraint
implies a multivalent clause that is violated by a given variable assignment
$(x_{j_1}, \ldots, x_{j_d}) = (v_1, \ldots, v_d)$. The construction proceeds as
follows. For each leaf node $t$ of the enumeration tree, select any constraint
$C_t$ violated at that node; if the node represents a feasible solution with
value $z_t$, let $C_t$ be $f(x) < z_t$. For each $t$ identify a multivalent
clause $M_t$ that a) is implied by $C_t$, b) contains only variables among
those that are fixed along the path from node $t$ to the root, and c) is
violated at node $t$. Then the desired refutation infers each $M_t$ from $C_t$
and then refutes $\bigwedge_t M_t$ using multivalent resolution, as indicated
in Theorem 4.

Sensitivity analysis now consists of observing the role of the constraints in
the refutation just constructed. Some specific results can be obtained by
applying the three principles below. Let a node be feasible if no constraints
in $\mathcal{C}$ are violated at the node, and infeasible otherwise. The
principles do not require that the resolution refutation actually be
constructed; only that the clauses $M_t$ at infeasible nodes and the values
$z_t$ at feasible nodes be saved. Because the search tree may have a very large
number of leaf nodes, it is practical to save only the distinct $M_t$s
associated with a given constraint. So for any constraint $C \in \mathcal{C}$
let $\{M_t \mid t \in M(C)\}$ be the set of distinct clauses $M_t$ for which
$C = C_t$. The number of feasible nodes is likely to be small, however, and it
is practical to keep a list of $z_t$ for each feasible node $t$ as well as
which variables are fixed (and to what values they are fixed) at $t$.

(S1) If a constraint in $\mathcal{C}$ is not associated with any leaf nodes,
then it is redundant and can be dropped without affecting the optimal value.

(S2) More generally, if a constraint $C \in \mathcal{C}$ is replaced with a
constraint $C'$ that still implies $M_t$ for all $t \in M(C)$, then:

i) the optimal value will not decrease;

ii) the optimal value will be at least $\min_{t \in I}\{z_t\}$, where $I$ is
the set of feasible nodes at which $C'$ is not violated.

(S3) Still more generally, if a constraint $C \in \mathcal{C}$ is replaced with
$C'$, then the optimal value will be at least

\[ \min\Bigl\{ \min_{t \in I}\{z_t\},\ \min_{t \in \Gamma}\{\underline{z}_t\} \Bigr\}, \]

where $I$ is as above, and $\Gamma$ is the set of nodes $t \in M(C)$ at which
$C'$ does not imply $M_t$. Also $\underline{z}_t$ is the minimum value of the
objective function $f(x)$ subject only to $\neg M_t$. That is,

\[ \underline{z}_t = \min\{ f(x) \mid x_j \in D_{x_j} \setminus T_{tj} \text{ for all } x_j \text{ in } M_t \}. \]

These principles can clearly be adapted to analyze the effect of changing two
or more constraints simultaneously.

6 0-1 Duality and Sensitivity Analysis

A 0-1 linear programming problem may be stated,

\[
\begin{aligned}
\min\ & cx \\
\text{s.t.}\ & Ax \ge a \\
& x \in \{0,1\}^n.
\end{aligned}
\tag{6}
\]

The inference dual is,

\[
\begin{aligned}
\max\ & z \\
\text{s.t.}\ & Ax \ge a \xrightarrow{\;0\text{-}1\;} cx \ge z.
\end{aligned}
\tag{7}
\]

A separation lemma for this dual would consist of a complete inference method
for linear 0-1 inequalities. Such a method was presented in [5]. Only a
complete refutation method is required for present purposes, however, and it
can be obtained as indicated in the previous section.

A refutation method is obtained by combining multivalent resolution with a
method for checking whether a 0-1 inequality implies a multivalent clause that
is falsified by a given variable assignment. Because the variables in this case
are bivalent, multivalent clauses become ordinary clauses. Multivalent
resolution therefore reduces to ordinary resolution. It is straightforward to
check whether $ax \ge \alpha$ implies a clause that is falsified by
$(x_{j_1}, \ldots, x_{j_d}) = (v_1, \ldots, v_d)$. Writing $v_j$ for the value
assigned to $x_j$, let $J$ be the set of indices $j$ in $\{j_1, \ldots, j_d\}$
for which $a_j < 0$ if $v_j = 1$ and $a_j > 0$ if $v_j = 0$, i.e., the fixed
variables whose values lower the largest attainable left-hand side. Let
$x_j(v)$ be $x_j$ if $v = 1$ and $\neg x_j$ if $v = 0$. Then
$\bigvee_{j \in J} x_j(1 - v_j)$ is the desired clause if
$\sum_{j \in J} a_j v_j + \sum_{j \notin J} \max\{0, a_j\} < \alpha$, and
otherwise there is no such clause.
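The check just described can be sketched in a few lines. The sketch takes $J$ to be the fixed variables whose values lower the largest attainable left-hand side ($a_j < 0$ when $v_j = 1$, $a_j > 0$ when $v_j = 0$), which is the reading that agrees with the clauses shown in Fig. 3 for nodes 5, 8, and 10. The function name `falsified_clause` and the dictionary encoding of clauses are assumptions made for this example.

```python
def falsified_clause(coeffs, rhs, fixed):
    """Clause implied by sum(coeffs[v] * x[v]) >= rhs and falsified by `fixed`.

    coeffs maps variable names to coefficients, fixed maps the fixed 0-1
    variables to their values. The clause is returned as a dict mapping a
    variable to the value its literal demands (1 for x_j, 0 for not x_j);
    None is returned if no such clause exists.
    """
    # Fixed variables whose values reduce the best attainable left-hand side.
    J = [v for v, val in fixed.items()
         if (val == 1 and coeffs.get(v, 0) < 0)
         or (val == 0 and coeffs.get(v, 0) > 0)]
    best = sum(coeffs[v] * fixed[v] for v in J)
    best += sum(max(0, c) for v, c in coeffs.items() if v not in J)
    if best < rhs:
        return {v: 1 - fixed[v] for v in J}
    return None


a = {"x1": 2, "x2": 5, "x3": -1}           # constraint (a), right-hand side 3
cut = {"x1": -7, "x2": -5, "x3": -3}       # objective cut -cx >= -(z - 1)
print(falsified_clause(a, 3, {"x1": 1, "x2": 0}))                 # node 5
print(falsified_clause(cut, -14, {"x1": 1, "x2": 1, "x3": 1}))    # node 8
print(falsified_clause(cut, -7, {"x1": 0, "x2": 1, "x3": 1}))     # node 10
```

The clause returned contains every helpful fixed variable, so it is not necessarily the shortest falsified clause the inequality implies; a shorter one may exist when dropping a literal still leaves the inequality unsatisfiable.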

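Consider for example the problem,

\[
\begin{array}{rll}
\min & 7x_1 + 5x_2 + 3x_3 & \\
\text{s.t.} & 2x_1 + 5x_2 - x_3 \ge 3 & (a) \\
 & -x_1 + x_2 + 4x_3 \ge 4 & (b) \\
 & x_1 + x_2 + x_3 \ge 2 & (c) \\
 & x \in \{0,1\}^3 &
\end{array}
\tag{8}
\]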

The  enumeration  tree  of  Fig.  2  solves  the  problem.  The  optimal  solution  value 
is  8. 

A resolution proof that $z \ge 8$ can be reconstructed in a way very similar to
that presented above for satisfiability problems. At each leaf node, one of the
violated inequalities is chosen as a premise; in this case, constraints (a) and
(b) suffice. If the leaf node $t$ represents a feasible solution with value
$z_t$, the premise consists of the constraint $cx \le z_t - 1$. These combined
premises must lead to a contradiction, thus proving that $z \ge 8$.

The contradiction can be demonstrated by resolution, as depicted in Fig. 3.
The inequalities at nodes 8 and 9, for instance, respectively imply clauses
that resolve to obtain $\neg x_1 \vee \neg x_2$ at node 4. The latter resolves
with $x_2$, implied by the inequality at node 5, to obtain $\neg x_1$ at
node 2.

[Figure: enumeration tree for problem (8); the root node is labeled (a),(b),(c).]

Fig. 2. A solution of the subproblem by branching. The constraints remaining
(i.e., not yet satisfied) at each nonleaf node are indicated. The constraints
violated at each leaf node, if any, are indicated with an asterisk; if no
constraints are violated, the objective function value $z$ is shown.


The inequality at node 10 implies a second clause $\neg x_1 \vee \neg x_3$ (not
shown in Fig. 3) that resolves with $x_3$ at node 11. But this clause can be
neglected because it is not falsified by variables fixed between node 10 and
the root.

Because the clauses inferred at the leaf nodes are falsified by the fixed
variables, the enumeration tree proves they are unsatisfiable. So there is a
resolution proof of unsatisfiability. In this case, the empty clause is
generated below the root, at node 3.
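The resolution steps of Fig. 3 can be replayed mechanically by resolving bottom-up along the tree, as Theorem 4 prescribes. The sketch below is illustrative: the tree shape and leaf clauses are those read from Fig. 3 and the surrounding text (nodes 6 and 7 as children of node 3 is an inference where the figure is not fully legible), and the names `resolvent` and `node_clause` are invented for this example. A clause maps each variable to the set of values its literal allows, so $\neg x_1$ is `{"x1": {0}}`.

```python
def resolvent(clauses, j):
    """Multivalent resolvent on variable j: intersect on j, union elsewhere."""
    variables = set().union(*(clause.keys() for clause in clauses))
    result = {}
    for k in variables:
        sets = [frozenset(clause.get(k, ())) for clause in clauses]
        values = (frozenset.intersection(*sets) if k == j
                  else frozenset().union(*sets))
        if values:
            result[k] = values
    return result


def node_clause(tree, leaf_clauses, node):
    """Clause at `node`: a leaf clause, or the resolvent of the children."""
    if node not in tree:
        return leaf_clauses[node]
    variable, children = tree[node]
    return resolvent(
        [node_clause(tree, leaf_clauses, child) for child in children],
        variable)


# Nonleaf node -> (branching variable, children), as read from Fig. 3.
tree = {1: ("x1", [2, 3]), 2: ("x2", [4, 5]), 3: ("x2", [6, 7]),
        4: ("x3", [8, 9]), 6: ("x3", [10, 11])}
leaf_clauses = {
    5: {"x2": {1}},                          # from (a)
    7: {"x2": {1}},                          # from (a)
    8: {"x1": {0}, "x2": {0}, "x3": {0}},    # from cx <= 14
    9: {"x3": {1}},                          # from (b)
    10: {"x2": {0}, "x3": {0}},              # from cx <= 7
    11: {"x3": {1}},                         # from (b)
}
print(node_clause(tree, leaf_clauses, 4))    # not x1 or not x2
print(node_clause(tree, leaf_clauses, 1))    # empty clause
```

An empty dictionary at the root corresponds to the empty clause, which completes the refutation.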

The  three  principles  (S1)-(S3)  cited  earlier  for  carrying  out  sensitivity  anal¬ 
ysis  are  readily  applied  to  this  0-1  programming  example. 


(S1) Because constraint (c) is associated with no leaf node in Fig. 3, it is
redundant and can be dropped without changing the optimal solution.

(S2) Constraint (a) is associated with leaf nodes 5 and 7. Both $M_5$ and $M_7$
are the singleton clause $x_2$, and so one can set $M(C) = \{5\}$.

i) Constraint (a) can be altered in any fashion such that it continues to
imply $x_2 = 1$, without reducing the optimal value. For instance, the
right-hand side can be reduced to any number greater than 2 and increased
arbitrarily. The coefficient of $x_2$ can be changed arbitrarily (if the
inequality becomes infeasible, it implies $x_2$ as well as any other clause).
The coefficient of $x_1$ can be increased to any number less than 3 and
reduced arbitrarily, and so forth. Constraint (b) can be similarly analyzed.
It is not hard to write general procedures for this.

Node 1 (root)
    Node 2: $\neg x_1$
        Node 4: $\neg x_1 \vee \neg x_2$
            Node 8: $-7x_1 - 5x_2 - 3x_3 \ge -14$, $(\neg x_1 \vee \neg x_2 \vee \neg x_3)$
            Node 9: $-x_1 + x_2 + 4x_3 \ge 4$ (b), $(x_3)$
        Node 5: $2x_1 + 5x_2 - x_3 \ge 3$ (a), $(x_2)$
    Node 3: $\emptyset$
        Node 6: $\neg x_2$
            Node 10: $-7x_1 - 5x_2 - 3x_3 \ge -7$, $(\neg x_2 \vee \neg x_3)$
            Node 11: $-x_1 + x_2 + 4x_3 \ge 4$ (b), $(x_3)$
        Node 7: $2x_1 + 5x_2 - x_3 \ge 3$ (a), $(x_2)$

Fig. 3. Construction of a proof of $z \ge 8$. The violated constraint at each
leaf node is shown, along with a falsified clause it implies. Resolvents are
shown at the nonleaf nodes.


ii) Suppose the right-hand side of constraint (a) is raised to 5. Of the two
feasible nodes 8 and 10, (a) is now violated at 10. The new optimal value is
therefore at least $z_8 = 15$.

(S3) Suppose constraint (a) is weakened by changing it to
$2x_1 + 5x_2 - x_3 \ge 2$, so that it no longer implies $x_2$ (the set of nodes
in $M(C)$ whose clause is no longer implied is $\Gamma = \{5\}$). Constraint
(a) of course remains unviolated at the feasible nodes 8 and 10
($I = \{8, 10\}$). So the optimal value is at least

\[ \min\bigl\{ \min\{z_8, z_{10}\},\ \underline{z}_5 \bigr\} = 0, \]

where $\underline{z}_5 = \min\{7x_1 + 5x_2 + 3x_3 \mid \neg x_2\} = 0$. In this
case no useful bound is obtained. Constraint (b) can be similarly analyzed.
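The two bounds just computed can be reproduced with a short calculation. The sketch below assumes the saved data are the values $z_t$ at feasible nodes and the clauses $M_t$ for the altered constraint; the function names `relaxed_min` and `sensitivity_bound` are invented for this example, and the relaxed minimum is found by brute force, which is only sensible for tiny instances.

```python
import itertools
import math


def relaxed_min(c, clause):
    """Smallest c.x over 0-1 points that falsify `clause`.

    `clause` maps a variable index to the value its literal demands, so a
    point falsifies the clause when it differs from every demanded value.
    """
    best = math.inf
    for x in itertools.product((0, 1), repeat=len(c)):
        if all(x[j] != value for j, value in clause.items()):
            best = min(best, sum(cj * xj for cj, xj in zip(c, x)))
    return best


def sensitivity_bound(z, unviolated, relaxed):
    """Lower bound of principle (S3).

    z: feasible node -> objective value; unviolated: feasible nodes at which
    the new constraint is not violated; relaxed: node -> relaxed minimum for
    each saved clause the new constraint no longer implies.
    """
    candidates = [z[t] for t in unviolated] + list(relaxed.values())
    return min(candidates, default=math.inf)


z = {8: 15, 10: 8}
c = [7, 5, 3]
# (S2) ii): right-hand side of (a) raised to 5, so only node 8 stays feasible.
print(sensitivity_bound(z, {8}, {}))
# (S3): (a) weakened so that clause x2 (index 1 demanded to be 1) is lost.
print(sensitivity_bound(z, {8, 10}, {5: relaxed_min(c, {1: 1})}))
```

With no clause lost the function reduces to the bound of (S2) ii); with both sets empty it returns infinity, corresponding to a problem that has become infeasible.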

7  Practical  Considerations 

The style of sensitivity analysis proposed here must be adapted to the problem
context. The interests of practitioners should guide which questions are asked,
and these questions should guide the type of perturbations that are studied.
Even in classical linear programming, sensitivity analysis can provide
information that is too complex and voluminous to be assimilated. One must take
care to highlight the results that are intelligible and relevant. At this point
it is impossible to predict the problem classes in which inference duality can
yield useful sensitivity results. Only practical trials can resolve this issue.

One possible impediment to the interpretation of sensitivity analysis is
“massive degeneracy.” There may be a large number of dual solutions, each
giving rise to a different sensitivity analysis. Fortunately, a single dual
solution can show a constraint to be unimportant. For instance, if a constraint
is redundant in one dual solution, then it is redundant simpliciter, in the
sense that it can be dropped without changing the solution. Or if one dual
solution yields an upper bound on the effect of a problem alteration, this
bound is valid in general. But to establish categorically that a constraint is
important, one must show that it is important in all dual solutions.

The  effects  of  degeneracy  can  be  ameliorated  somewhat.  Degeneracy  has  two 
sources:  many  different  search  trees  arrive  at  the  same  optimal  value,  and  a  given 
search  tree  can  be  analyzed  in  many  different  ways.  The  first  source  must  be 
accepted,  because  it  is  usually  impractical  to  solve  a  problem  more  than  once. 
However, if the role of a few particular constraints is of interest, the search
tree  can  be  analyzed  with  this  in  mind.  These  constraints  should  be  avoided, 
whenever  possible,  when  associating  violated  constraints  with  leaf  nodes.  More 
generally,  the  multivalent  clauses  derived  from  violated  constraints  should  be  as 
weak  as  possible.  Ultimately,  however,  degeneracy  must  be  viewed  as  inherent 
in  the  nature  of  things  rather  than  an  artifact  of  the  analysis.  It  is  often  an 
unavoidable  fact  that  different  rationales  can  be  given  for  an  optimal  solution, 
each  drawing  on  different  information  in  the  constraint  set. 

Although  the  results  presented  here  apply  only  to  problems  that  are  solved 
by  tree  search,  it  is  rare  that  a  (verified)  optimal  solution  is  found  by  other 
means  in  discrete  problems.  As  noted  earlier,  the  analysis  can  be  modified  to 
accommodate  tree  searches  that  use  such  devices  as  bounding  and  relaxations 
to  prune  the  tree. 

It is also common for applications to require both discrete and continuous
variables. It is an interesting research issue as to how the classical methods
for continuous sensitivity analysis may be combined with the discrete methods
presented here.

Abstract. ‘Constraint hierarchy’ is a nonmonotonic system that allows
programmers to describe over-constrained real-world problems by specifying
constraints with hierarchical preferences, and has been applied to various
areas. An important aspect of constraint hierarchies is the existence of
efficient satisfaction algorithms based on local propagation. However, past
local-propagation algorithms have been limited to multi-way equality
constraints. We overcome this by reformulating constraint hierarchies with a
stricter definition, and proposing generalized local propagation as a
theoretical framework for studying constraint hierarchies and local
propagation. Then, we show that global semi-monotonicity in satisfying
hierarchies turns out to be a practically useful property in generalized local
propagation. Finally, we discuss the relation of generalized local propagation
to our previous DETAIL algorithm for solving hierarchies of multi-way equality
constraints.


Keywords: constraint hierarchies, nonmonotonicity, local propagation,
multi-way constraints.


1  Introduction 

Constraint hierarchies allow programmers to describe over-constrained
real-world problems by specifying constraints with hierarchical strengths or
preferences [1, 2], and have been applied to various research areas such as
constraint logic programming [11, 13], constraint imperative programming [3],
and graphical user interfaces [8, 9]. Intuitively, in a constraint hierarchy,
the stronger a constraint is, the more it influences the solutions of the
hierarchy. For example, the hierarchy of the constraints strong $x = 0$ and
weak $x = 1$ yields the solution $x = 0$. This property enables programmers to
specify preferential or default constraints that may be used in case the set of
required or non-default constraints is under-constrained. Moreover, constraint
hierarchies are general enough to handle powerful constraint systems such as
arithmetic equations and inequalities over reals. Additionally, they allow
‘relaxing’ of constraints with the same strength by applying, e.g., the
least-squares method.

Theoretically, the key property of constraint hierarchies is nonmonotonicity.
That is, addition of a new constraint to an existing hierarchy may change the
set of solutions completely,¹ while in ordinary monotonic constraint systems,
it would either preserve or reduce the solution set. For instance, if we add
the constraint strong $x = 0$ to the hierarchy of weak $x = 1$, the solution
will change from $x = 1$ into $x = 0$. Clearly, this nonmonotonic property
gives us the power to specify default constraints.

An important aspect of constraint hierarchies as a nonmonotonic system is that
efficient satisfaction algorithms have been proposed. We can categorize them
into the following two approaches:

The refining method first satisfies the strongest level, and then weaker levels
successively. It is employed in the DeltaStar algorithm [11] and a hierarchical
constraint logic programming language CHAL [10].

Local propagation gradually solves hierarchies by repeatedly selecting
uniquely satisfiable constraints. It is mainly used in constraint solvers for
graphical user interfaces such as DeltaBlue [4], SkyBlue [9], and DETAIL [6].

First, to illustrate the refining method, suppose we have a hierarchy
consisting of required $x = y$, strong $y = z + 1$, medium $z = 0$, and weak
$z = 1$. This is solved as follows: first, by satisfying the strongest
constraint required $x = y$, the method reduces the set $\Theta$ of all
variable assignments (mappings from variables to their values) to
$\{\theta \in \Theta \mid \theta(x) = \theta(y)\}$; second, by fulfilling the
next strongest one strong $y = z + 1$, we obtain
$\{\theta \in \Theta \mid \theta(x) = \theta(y) \wedge \theta(y) = \theta(z) + 1\}$;
third, evaluating medium $z = 0$ yields
$\{\theta \in \Theta \mid \theta(x) = 1 \wedge \theta(y) = 1 \wedge \theta(z) = 0\}$;
now, the weakest constraint weak $z = 1$ conflicts with the assignments that
have been generated from the stronger constraints, and therefore remains
unsatisfied. As shown in this example, the refining method is a
‘straightforward’ algorithm for solving constraint hierarchies.
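The refining steps of this example can be mimicked over a small finite domain. The sketch below is a deliberate simplification and not the DeltaStar algorithm: it assumes one constraint per level, treats each constraint as simply satisfied or unsatisfied (no error functions), restricts all variables to the values 0, 1, 2, and does not treat an unsatisfiable required constraint specially. The name `refine` is invented for this example.

```python
import itertools


def refine(variables, domain, levels):
    """Filter assignments level by level, strongest constraint first.

    levels is a list of predicates on an assignment dict. A constraint that
    no remaining assignment satisfies is skipped, i.e. left unsatisfied.
    """
    candidates = [dict(zip(variables, values))
                  for values in itertools.product(domain, repeat=len(variables))]
    for constraint in levels:
        kept = [theta for theta in candidates if constraint(theta)]
        if kept:
            candidates = kept
    return candidates


levels = [
    lambda t: t["x"] == t["y"],        # required x = y
    lambda t: t["y"] == t["z"] + 1,    # strong   y = z + 1
    lambda t: t["z"] == 0,             # medium   z = 0
    lambda t: t["z"] == 1,             # weak     z = 1
]
print(refine(("x", "y", "z"), range(3), levels))
```

The weak constraint is skipped because no assignment surviving the stronger levels satisfies it, leaving the single assignment with $x = 1$, $y = 1$, $z = 0$.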

Next, to demonstrate local propagation, reconsider the hierarchy in the last
example. Local propagation handles it as follows: first, since medium $z = 0$
can be uniquely solved, it acquires $\{\theta \in \Theta \mid \theta(z) = 0\}$;
next, since the instantiation of $z$ makes strong $y = z + 1$ uniquely
satisfiable, it produces
$\{\theta \in \Theta \mid \theta(y) = 1 \wedge \theta(z) = 0\}$; finally,
computing required $x = y$, it outputs
$\{\theta \in \Theta \mid \theta(x) = 1 \wedge \theta(y) = 1 \wedge \theta(z) = 0\}$.
Note that it must reject the weakest constraint weak $z = 1$ at the beginning;
otherwise, it would yield an incorrect or empty solution. As suggested by this
example, local-propagation algorithms must plan in what order they will choose
and solve constraints, discarding the ones that lead to incorrect solutions.
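A toy version of this selection process is sketched below. It is not DeltaBlue, SkyBlue, or DETAIL: it uses a naive greedy rule (always solve the strongest constraint that currently has exactly one undetermined variable) that happens to work for this example but is not a correct planner in general. The tuple layout of a constraint and the name `local_propagation` are assumptions made for illustration.

```python
def local_propagation(constraints):
    """Greedy local propagation over (label, strength, variables, solve, holds).

    A smaller strength number means a stronger constraint. solve(var, values)
    returns the value of the single undetermined variable; holds(values)
    tests a fully determined constraint.
    """
    values, rejected = {}, []
    pending = sorted(constraints, key=lambda con: con[1])
    progress = True
    while pending and progress:
        progress = False
        for con in pending:
            label, _, variables, solve, holds = con
            unknown = [v for v in variables if v not in values]
            if len(unknown) > 1:
                continue                      # not uniquely satisfiable yet
            if unknown:
                values[unknown[0]] = solve(unknown[0], values)
            elif not holds(values):
                rejected.append(label)        # conflicts with stronger ones
            pending.remove(con)
            progress = True
            break                             # restart from the strongest
    return values, rejected


hierarchy = [
    ("required x = y", 0, ("x", "y"),
     lambda var, v: v["y"] if var == "x" else v["x"],
     lambda v: v["x"] == v["y"]),
    ("strong y = z + 1", 1, ("y", "z"),
     lambda var, v: v["z"] + 1 if var == "y" else v["y"] - 1,
     lambda v: v["y"] == v["z"] + 1),
    ("medium z = 0", 2, ("z",), lambda var, v: 0, lambda v: v["z"] == 0),
    ("weak z = 1", 3, ("z",), lambda var, v: 1, lambda v: v["z"] == 1),
]
values, rejected = local_propagation(hierarchy)
print(values, rejected)
```

On this hierarchy the constraints are solved in the order medium, strong, required, and the weak constraint is rejected once $z$ has been fixed, which mirrors the order described above.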

Local propagation takes advantage of the potential locality of typical
(possibly non-hierarchical) constraint networks in graphical user interfaces.
Basically, it is efficient because it uniquely solves a single constraint in
each step. In addition, when a variable value is repeatedly updated by an
operation such as dragging in interactive interfaces, it can easily re-evaluate
only the necessary constraints. However, local propagation has been restricted
to multi-way equality constraints which can be uniquely solved for each
variable, e.g. linear equations over reals. Also, it cannot find multiple
solutions for a given constraint hierarchy due to the uniqueness.

¹ Wilson and Borning refer to the property, in a less familiar word, as
‘disorderly’ [12]. Instead, they use nonmonotonicity for another concept in
hierarchical constraint logic programming.

Naturally, a question arises whether we can ‘generalize’ local propagation to
solve hierarchies of more powerful constraints without losing its efficiency.
In this research, we first reformulate the constraint hierarchy theory, and
then introduce a property of constraint systems called global
semi-monotonicity, which is weaker than monotonicity but not disordered
nonmonotonicity. Next, we propose generalized local propagation (GLP), a
theoretical framework for investigating local propagation on constraint
hierarchies, and show that global semi-monotonicity exhibits a practically
useful property in generalized local propagation. Finally, to illustrate the
utilization of GLP, we relate it to our previous DETAIL algorithm for multi-way
equality constraints that can be simultaneously solved or properly relaxed [6].


2  Related  Work 

This section briefly overviews previous research on nonmonotonic constraint
systems  from  the  viewpoint  of  local  propagation. 

Borning et al., the originators of constraint hierarchies [1], studied properties
of hierarchies [2, 12], and also developed local-propagation algorithms called
DeltaBlue [4] and SkyBlue [9]. However, their research on theoretical properties
did  not  cover  local  propagation  on  constraint  hierarchies,  but  rather  mainly 
focused  on  hierarchical  constraint  logic  programming  (HCLP)  [11,  12,  13]. 

Jampel  constructed  a  certain  HCLP  instance  that  separates  the  HCLP 
scheme  into  compositional  and  non-compositional  parts  [7].  The  method  is  ex¬ 
pected  to  improve  the  efficiency  of  interpreters  and  compilers  since  the  compo¬ 
sitional  part  is  efficiently  implementable.  However,  it  is  unclear  whether  such  a 
method  is  applicable  to  local  propagation. 

Freuder and Wallace proposed partial constraint satisfaction for handling
constraint  satisfaction  problems  that  are  impossible  or  impractical  to  solve  [5]. 
Theoretically,  it  is  general  enough  to  simulate  constraint  hierarchies.  However, 
the  presented  algorithms  were  ones  searching  for  approximate  solutions  by  ‘weak¬ 
ening’  problems  over  finite  domains. 


3  Generalized  Local  Propagation 

In this section, we first reformulate constraint hierarchies, and then
introduce global semi-monotonicity of constraint hierarchies. Next, we
generalize local propagation on constraint hierarchies, and study its
properties for obtaining correct solutions.

3.1 A Reformulation of Constraint Hierarchies

Before generalizing local propagation, we modify the original formulation of
constraint hierarchies in [2] so that it will allow us to better investigate
local propagation. Intuitively, the main changes are to explicitly parameterize
target hierarchies, and to replace concrete embedded functions/relations with
abstract ones satisfying reasonable conditions. First, we define basic terms
and symbols. Let $\mathbf{X}$ be the set of all variables, $\mathbf{D}$ the
domain of the variables, and $\mathbf{C}$ the set of all constraints.² Given a
constraint $c$, $X(c)$ denotes the set of all the variables constrained by $c$,
and given a set $C$ of constraints, we define
$X(C) = \{x \in \mathbf{X} \mid \exists c \in C.\ x \in X(c)\}$. A strength of
a constraint is an integer $l$ such that $0 \le l \le w$, where $w$ is some
positive integer. Intuitively, the larger the integer is, the weaker the
strength is. Let $\mathbf{L}$ be the set of all the strengths. A constraint $c$
with a strength $l$ is represented by $c/l$. A constraint hierarchy is a finite
set $H$ of constraints with strengths, and $\mathcal{H}$ expresses the set of
all constraint hierarchies. For brevity, we write a variable as $x$, a
constraint as $c$, a strength as $l$, and a constraint hierarchy as $H$,
possibly with primes or subscripts.

To represent solutions to constraint hierarchies, we use variable assignments.
A variable assignment, denoted as $\theta$, is a mapping from $\mathbf{X}$ to
$\mathbf{D}$, and $\Theta$ indicates the set of all variable assignments. Given
a set $X$ of variables, we define
$\theta(X) = \theta'(X) \Leftrightarrow \forall x \in X.\ \theta(x) = \theta'(x)$.

To assign semantics to constraints, we first introduce error functions in the
same manner as the original formalization of constraint hierarchies [2]:

Definition 1 (error function). An error function for $l$ is a mapping
$e_l : \mathcal{C} \times \Theta \to \{0\} \cup \mathbb{R}^+$ such that for any $c$,
$\theta$, and $\theta'$,
$\theta(X(c)) = \theta'(X(c)) \Rightarrow e_l(c,\theta) = e_l(c,\theta')$.

Intuitively, $e_l(c,\theta)$ indicates the error of $c/l$ under $\theta$, which is zero
if $c/l$ is exactly satisfied, and positive otherwise. The condition requires that
errors of a constraint under two variable assignments are equal if the assignments
have equal values for each constrained variable.

Next,  we  introduce  level  comparators: 


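Definition 2 (level comparator). A level comparator for $l$ is a ternary relation
$\overset{\cdot/l}{\lesssim} : \mathcal{H} \times \Theta \times \Theta$ such that for any
$H$, $H'$, $\theta$, $\theta'$, and $\theta''$:‡

$\forall c \in \mathcal{C}.\ (c/l \in H \Leftrightarrow c/l \in H') \Rightarrow (\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \theta \overset{H'/l}{\lesssim} \theta')$  (1)

$\forall c/l \in H.\ e_l(c,\theta) = e_l(c,\theta'') \Rightarrow (\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \theta'' \overset{H/l}{\lesssim} \theta')$  (2)

$\forall c/l \in H.\ e_l(c,\theta') = e_l(c,\theta'') \Rightarrow (\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \theta \overset{H/l}{\lesssim} \theta'')$  (3)

$\forall c/l \in H.\ e_l(c,\theta) \le e_l(c,\theta') \Rightarrow \theta \overset{H/l}{\lesssim} \theta'$  (4)

$\theta \overset{H/l}{\lesssim} \theta' \wedge \theta' \overset{H/l}{\lesssim} \theta'' \Rightarrow \theta \overset{H/l}{\lesssim} \theta''$  (5)

$\theta \overset{H/l}{\lesssim} \theta' \wedge \theta \overset{H'/l}{\lesssim} \theta' \Rightarrow \theta \overset{H \cup H'/l}{\lesssim} \theta'$  (6)

† We simply define variables and constraints as elements in the corresponding sets,
and separately provide their semantics using certain functions and relations.

‡ When we write $\forall c/l \in H$, we mean that the universal quantifier $\forall$ is
associated only with $c$. In other words, $l$ is either free or quantified by another
preceding one.
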

Intuitively, $\theta \overset{H/l}{\lesssim} \theta'$ means "$\theta$ is better than or
similar to $\theta'$ according to $l$ of $H$."
Conditions (1)-(3) say that the scope of a level comparator is restricted to be
inside a designated level. Condition (4) indicates that if errors of all constraints
at a level under an assignment are smaller than or equal to those under another
assignment, then the former assignment is better than or similar to the latter
according to the level. Condition (5) is 'transitivity' of a level comparator.
Condition (6) means that if, in two hierarchies, an assignment is better than or
similar to another according to the level, then the relation holds in the
combination of the hierarchies.

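For convenience, we define $\overset{\cdot/l}{\gtrsim}$ (worse than or similar to),
$\overset{\cdot/l}{\sim}$ (similar to), $\overset{\cdot/l}{<}$ (better than),
$\overset{\cdot/l}{>}$ (worse than), and $\overset{\cdot/l}{\not\sim}$ (incomparable with)
as follows:

$\theta \overset{H/l}{\gtrsim} \theta' \Leftrightarrow \theta' \overset{H/l}{\lesssim} \theta$;
$\theta \overset{H/l}{\sim} \theta' \Leftrightarrow \theta \overset{H/l}{\lesssim} \theta' \wedge \theta \overset{H/l}{\gtrsim} \theta'$;
$\theta \overset{H/l}{<} \theta' \Leftrightarrow \theta \overset{H/l}{\lesssim} \theta' \wedge \neg\,\theta \overset{H/l}{\sim} \theta'$;
$\theta \overset{H/l}{>} \theta' \Leftrightarrow \theta \overset{H/l}{\gtrsim} \theta' \wedge \neg\,\theta \overset{H/l}{\sim} \theta'$;
$\theta \overset{H/l}{\not\sim} \theta' \Leftrightarrow \neg\,\theta \overset{H/l}{\lesssim} \theta' \wedge \neg\,\theta \overset{H/l}{\gtrsim} \theta'$.

The original definition of level comparators is quite different from Definition 2
in the following ways: it separates $\overset{\cdot/l}{\lesssim}$ into
$\overset{\cdot/l}{<}$ and $\overset{\cdot/l}{\sim}$, and defines them constructively;
it includes (1)-(3) operationally; it seems to implicitly assume (4); it does
not require the transitivity of $\overset{\cdot/l}{\sim}$, unlike (5); it presents no
condition like (6). Theoretically, the greatest difference is the lack of the
transitivity of $\overset{\cdot/l}{\sim}$, which we will discuss later in Subsect. 3.4.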

A useful example of a level comparator is the least-squares level comparator,
defined as
$\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \sum_{c/l \in H} e_l(c,\theta)^2 \le \sum_{c/l \in H} e_l(c,\theta')^2$.
Here, two variable assignments are compared by summing squares of errors of
constraints at the level. It is easy to prove that the definition fulfills the
conditions for level comparators. Used in satisfaction of constraint hierarchies,
it works as the least-squares method within level $l$.
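
As an illustration only, the least-squares level comparator can be sketched in a few lines of Python. The representation is an assumption of this sketch, not part of the formulation: a constraint is a pair of an error function and an integer level, an assignment is a dictionary, and `eq_error` is a hypothetical helper that takes the error of `var = target` to be the absolute deviation.

```python
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, float]
ErrorFn = Callable[[Assignment], float]
Constraint = Tuple[ErrorFn, int]  # (error function, strength level); 0 is strongest


def eq_error(var: str, target: float) -> ErrorFn:
    """Error of the hypothetical constraint `var = target`: absolute deviation."""
    return lambda theta: abs(theta[var] - target)


def ls_leq(h: List[Constraint], level: int, a: Assignment, b: Assignment) -> bool:
    """True if `a` is better than or similar to `b` at `level` (least squares)."""
    def sq_sum(theta: Assignment) -> float:
        # Only constraints of the designated level contribute.
        return sum(err(theta) ** 2 for err, lv in h if lv == level)
    return sq_sum(a) <= sq_sum(b)


# Two conflicting constraints at level 1: x = 0 and x = 2.
h = [(eq_error("x", 0.0), 1), (eq_error("x", 2.0), 1)]
middle, left = {"x": 1.0}, {"x": 0.0}
print(ls_leq(h, 1, middle, left))  # squared errors 1 + 1 versus 0 + 4
print(ls_leq(h, 1, left, middle))
```

In this toy instance the compromise value x = 1 is preferred over x = 0, which is the behaviour expected of a least-squares method within one level.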

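Next, we define constraint-hierarchy comparators that totally compare
hierarchies by combining level comparators:

Definition 3 (constraint-hierarchy comparator). A constraint-hierarchy
comparator is a ternary relation $\overset{\cdot}{<} : \mathcal{H} \times \Theta \times \Theta$
such that for any $H$, $\theta$, and $\theta'$,

$\theta \overset{H}{<} \theta' \Leftrightarrow \exists l \in \mathcal{L}.\ (\forall l' \in \mathcal{L}.\ l' < l \Rightarrow \theta \overset{H/l'}{\sim} \theta') \wedge \theta \overset{H/l}{<} \theta'$.

Intuitively, $\theta \overset{H}{<} \theta'$ means "$\theta$ is better than $\theta'$
according to $H$." It is defined as lexicographic ordering with level comparators
as its components. Consequently, the result of a level comparator has absolute
priority over those of weaker ones.

For convenience, we define $\overset{\cdot}{>}$ (worse than), $\overset{\cdot}{\sim}$
(similar to), $\overset{\cdot}{\lesssim}$ (better than or similar to),
$\overset{\cdot}{\gtrsim}$ (worse than or similar to), and $\overset{\cdot}{\not\sim}$
(incomparable with) as follows:

$\theta \overset{H}{>} \theta' \Leftrightarrow \theta' \overset{H}{<} \theta$;
$\theta \overset{H}{\sim} \theta' \Leftrightarrow \forall l \in \mathcal{L}.\ \theta \overset{H/l}{\sim} \theta'$;
$\theta \overset{H}{\lesssim} \theta' \Leftrightarrow \theta \overset{H}{<} \theta' \vee \theta \overset{H}{\sim} \theta'$;
$\theta \overset{H}{\gtrsim} \theta' \Leftrightarrow \theta \overset{H}{>} \theta' \vee \theta \overset{H}{\sim} \theta'$;
$\theta \overset{H}{\not\sim} \theta' \Leftrightarrow \neg\,\theta \overset{H}{\lesssim} \theta' \wedge \neg\,\theta \overset{H}{\gtrsim} \theta'$.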

The following definition describes the satisfaction of constraint hierarchies
using a constraint-hierarchy comparator:

Definition 4 (constraint-hierarchy satisfier). A constraint-hierarchy satisfier
is a mapping $S : 2^{\Theta} \times \mathcal{H} \to 2^{\Theta}$ defined as
$S(\Theta', H) = \{\theta \in \Theta' \mid \neg\exists \theta' \in \Theta'.\ \theta' \overset{H}{<} \theta\}$,
where $\Theta'$ is a set of variable assignments.

As a shorthand, we write $S(H)$ instead of $S(\Theta, H)$. Intuitively,
$S(\Theta', H)$ is the set of assignments obtained by nonmonotonically satisfying
$H$ in $\Theta'$. By definition, an assignment in $S(\Theta', H)$ is an element in
$\Theta'$ such that there is no better assignment in $\Theta'$ when compared
according to $H$.

Finally, we define solutions of constraint hierarchies:

Definition 5 (solution). A solution to $H$ is a variable assignment in $S(H)$.

In other words, a solution to $H$ is an assignment found by nonmonotonically
satisfying $H$ in the set of all assignments.
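
The lexicographic comparator of Definition 3 and the satisfier of Definition 4 can be sketched as follows. This is a minimal illustration under stated assumptions: the candidate set is finite, an assignment is just the integer value of one hypothetical variable x, and each level is compared by least squares (so two assignments are never incomparable at a level).

```python
from typing import Callable, List, Sequence, Tuple

Assignment = int  # value of a single hypothetical variable x
Constraint = Tuple[Callable[[Assignment], float], int]  # (error function, level)


def level_score(h: Sequence[Constraint], level: int, theta: Assignment) -> float:
    """Sum of squared errors of the constraints of `h` at `level`."""
    return sum(err(theta) ** 2 for err, lv in h if lv == level)


def better(h: Sequence[Constraint], levels: Sequence[int],
           a: Assignment, b: Assignment) -> bool:
    """Hierarchy comparator: similar at all stronger levels, strictly better at one."""
    for lv in sorted(levels):  # strongest (smallest) level first
        sa, sb = level_score(h, lv, a), level_score(h, lv, b)
        if sa != sb:
            return sa < sb
    return False  # similar at every level


def satisfy(candidates: Sequence[Assignment], h: Sequence[Constraint],
            levels: Sequence[int]) -> List[Assignment]:
    """S(candidates, H): candidates for which no strictly better candidate exists."""
    return [t for t in candidates
            if not any(better(h, levels, u, t) for u in candidates)]


# Level 0: x >= 1; level 1: x = 0. The two conflict, and the stronger one wins.
hierarchy = [(lambda x: max(0.0, 1.0 - x), 0), (lambda x: abs(float(x)), 1)]
solutions = satisfy(range(4), hierarchy, levels=[0, 1])
print(solutions)  # [1]
```

The weaker constraint x = 0 is satisfied only as far as the stronger constraint x >= 1 permits, which yields x = 1.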

One difference between the original and our formulations in defining
constraint-hierarchy comparators is that the original restricts top-level
constraints to be required, whereas ours allows conflicting constraints at the top
level. This is because our definition of hierarchy satisfiers excludes the special
treatment of the top level. However, the resulting solutions are the same so far as
the top level is not over-constrained. Also, even if we add the condition for the
top level to be required, we can easily accommodate it in our following proofs.


3.2  Global  Semi-Monotonicity 

We define a useful property called global semi-monotonicity (GSM) in satisfying
constraint hierarchies as follows:

Definition 6 (global semi-monotonicity). $S$ is globally semi-monotonic iff
for any $H$ and $H'$, $S(H) \cap S(H') \subseteq S(H \cup H')$.

GSM requires that any common solution to two constraint hierarchies is also a
solution to their combination. It is not only natural but also weak (or general)
in the sense that the condition is true for any two hierarchies sharing no solutions.

GSM, by definition, is not limited to constraint hierarchies. In a similar
style, we can express basic properties of constraint systems. For example, we
can represent ordinary monotonicity as $S(H) \cap S(H') = S(H \cup H')$, where the
difference from GSM is that it has $S(H) \cap S(H') \supseteq S(H \cup H')$. Thus, we
can see that GSM lacks the familiar style of the monotonic property,
$S(H) \supseteq S(H \cup H')$. (Such a universal style of formal properties is helpful
in comparing different nonmonotonic systems.)

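We present a useful class of GSM constraint-hierarchy satisfiers called global
constraint-hierarchy satisfiers, using global level comparators and global
constraint-hierarchy comparators:

Definition 7 (global level comparator). A level comparator
$\overset{\cdot/l}{\lesssim}$ is global iff for any $H$, $H'$, $\theta$, and $\theta'$:

$\theta \overset{H/l}{\lesssim} \theta' \wedge \theta \overset{H'/l}{<} \theta' \Rightarrow \theta \overset{H \cup H'/l}{<} \theta'$  (7)

$\theta \overset{H/l}{\not\sim} \theta' \Rightarrow \theta \overset{H \cup H'/l}{\not\sim} \theta'$  (8)

$\theta \overset{H \cup H'/l}{<} \theta' \Rightarrow \theta \overset{H/l}{<} \theta' \vee \theta \overset{H'/l}{<} \theta'$  (9)
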

Definition 8 (global constraint-hierarchy comparator). A constraint-hierarchy
comparator is global iff each level comparator is global.

Definition 9 (global constraint-hierarchy satisfier). A constraint-hierarchy
satisfier is global iff its constraint-hierarchy comparator is global.


An example of global level comparators is the least-squares level comparator.
Most level comparators presented in the original formulation are also global.
The following theorem proves that global satisfiers are GSM:


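Theorem 10. $S$ is GSM if $S$ is global.

Proof. By contradiction: Assume that there exists a $\theta$ that is in $S(H)$ and
$S(H')$ but not in $S(H \cup H')$. Then, for some $\theta'$,
$\theta' \overset{H \cup H'}{<} \theta$ holds, that is, for some $l$,
$(\forall l' \in \mathcal{L}.\ l' < l \Rightarrow \theta' \overset{H \cup H'/l'}{\sim} \theta) \wedge \theta' \overset{H \cup H'/l}{<} \theta$.
By (7) and (8), $\theta' \overset{H \cup H'/l'}{\sim} \theta$ implies
$(\theta' \overset{H/l'}{<} \theta \wedge \theta' \overset{H'/l'}{>} \theta) \vee (\theta' \overset{H/l'}{\sim} \theta \wedge \theta' \overset{H'/l'}{\sim} \theta) \vee (\theta' \overset{H/l'}{>} \theta \wedge \theta' \overset{H'/l'}{<} \theta)$,
and by (9), $\theta' \overset{H \cup H'/l}{<} \theta$ implies
$\theta' \overset{H/l}{<} \theta \vee \theta' \overset{H'/l}{<} \theta$. Hence, it must be
either of the following two cases:

Case $\exists l' \in \mathcal{L}.\ l' \le l \wedge (\forall l'' \in \mathcal{L}.\ l'' < l' \Rightarrow \theta' \overset{H/l''}{\sim} \theta \wedge \theta' \overset{H'/l''}{\sim} \theta) \wedge \theta' \overset{H/l'}{<} \theta$.
Then, $\theta' \overset{H}{<} \theta$ holds, which is a contradiction to
$\theta \in S(H)$.

Case $\exists l' \in \mathcal{L}.\ l' \le l \wedge (\forall l'' \in \mathcal{L}.\ l'' < l' \Rightarrow \theta' \overset{H/l''}{\sim} \theta \wedge \theta' \overset{H'/l''}{\sim} \theta) \wedge \theta' \overset{H'/l'}{<} \theta$.
Then, $\theta' \overset{H'}{<} \theta$ holds, which is a contradiction to
$\theta \in S(H')$. □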


The converse, that GSM satisfiers are global, is not true; in fact, we have not
found weaker conditions for level comparators that yield a set equivalent to
GSM. However, we believe that most useful GSM satisfiers are global.^6

^6 Actually, we could make the converse true if we strengthened the formulation of
constraint hierarchies by allowing only 'modular' hierarchy comparators as follows:
let level comparators be in a certain set including the least-squares level
comparator, and also let hierarchy comparators need to be arbitrarily composed of
level comparators in the set. For modular hierarchy comparators, the truth of the
converse is easily provable since we can create a non-GSM satisfier by combining
any non-global and the least-squares level comparators. Another set of level
comparators without the least-squares level comparator may exist, but is unlikely
to be more useful.

Global hierarchy comparators might seem strongly related to globally-better
comparators in the original formulation, but in fact, they are different. A
globally-better comparator is a hierarchy comparator composed of level
comparators that compare reals generated by combining errors of constraints. One
instance, least-squares-better, is composed of the least-squares level comparators,
and therefore, is global. However, worst-case-better, composed of the worst-case
level comparators defined as
$\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \max_{c/l \in H} e_l(c,\theta) \le \max_{c/l \in H} e_l(c,\theta')$,
is not global because (7) does not hold. Generally, for level comparators of
globally-better comparators, (8) is true since they compare reals, i.e.
$\neg\,\theta \overset{H/l}{\not\sim} \theta'$. However, it depends on actual instances
of level comparators whether both (7) and (9) hold.
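
The contrast between least-squares and worst-case comparators can be checked on a tiny instance. The sketch below is an illustration under stated assumptions: an assignment is identified with its vector of errors for three hypothetical constraints c1, c2, c3, the set of assignments is restricted to two candidates, and the maximum over an empty level is taken to be zero. On this instance the worst-case comparator ties at level 0 of the combined hierarchy, so a common solution of the two hierarchies is lost in their combination, while least squares keeps it.

```python
from typing import Callable, List, Tuple

Errors = Tuple[float, float, float]   # errors of c1, c2, c3 under one assignment
Hierarchy = List[Tuple[int, int]]     # (constraint index, level)
Combine = Callable[[List[float]], float]


def least_squares(errs: List[float]) -> float:
    return sum(e * e for e in errs)


def worst_case(errs: List[float]) -> float:
    return max(errs, default=0.0)  # assumption: an empty level has error 0


def better(h: Hierarchy, a: Errors, b: Errors, combine: Combine) -> bool:
    """Lexicographic comparison over levels 0 and 1 using `combine` per level."""
    for lv in (0, 1):
        va = combine([a[i] for i, l in h if l == lv])
        vb = combine([b[i] for i, l in h if l == lv])
        if va != vb:
            return va < vb
    return False


def satisfy(cands: List[Errors], h: Hierarchy, combine: Combine) -> List[Errors]:
    return [t for t in cands if not any(better(h, u, t, combine) for u in cands)]


def gsm_holds(cands: List[Errors], h1: Hierarchy, h2: Hierarchy,
              combine: Combine) -> bool:
    """Check S(H1) ∩ S(H2) ⊆ S(H1 ∪ H2) on the given candidates only."""
    s2 = satisfy(cands, h2, combine)
    common = [t for t in satisfy(cands, h1, combine) if t in s2]
    union = satisfy(cands, h1 + h2, combine)
    return all(t in union for t in common)


H1 = [(0, 0), (2, 1)]   # c1 at level 0, c3 at level 1
H2 = [(1, 0)]           # c2 at level 0
CANDS = [(0.0, 1.0, 1.0), (1.0, 1.0, 0.0)]

print(gsm_holds(CANDS, H1, H2, least_squares))  # True
print(gsm_holds(CANDS, H1, H2, worst_case))     # False
```

The check covers only this restricted candidate set; it illustrates how the failure of (7) for a maximum-based comparator can surface, and is not a proof about arbitrary hierarchies.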


3.3  Generalized  Local  Propagation 

Classical local propagation satisfies a constraint network by successively solving
individual constraints in an order closely associated with the network topology.
Here we generalize local propagation so that it can solve a set of constraints
in one step and can also introduce an arbitrary order among such constraint
sets. For this purpose, we introduce ordered partitions as follows: a partition
of a constraint hierarchy is a set generated by decomposing the hierarchy into
disjoint subsets called blocks; given a partition $P$, an ordered partition of $P$ is
a pair $\langle P, \le_P \rangle$, where $\le_P$ is an arbitrary partial order among
blocks in $P$. For brevity, we write $B <_P B'$ instead of
$B \le_P B' \wedge B \ne B'$.

Using  ordered  partitions  into  blocks,  we  define  generalized  local  propagation 
(GLP)  in  the  following  way: 

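Definition 11 (generalized local propagation). Generalized local propagation
with $S$ is a mapping $\pi_S(\langle P, \le_P \rangle)$ defined as follows:

$$
\pi_S(\langle P, \le_P \rangle) =
\begin{cases}
\Theta & \text{if } |P| = 0 \\
\bigcap_{B \in \mathit{terminals}(\langle P, \le_P \rangle)} S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B)), B) & \text{otherwise,}
\end{cases}
$$

where terminals and before are as follows:

$\mathit{terminals}(\langle P, \le_P \rangle) = \{B' \in P \mid \neg\exists B'' \in P.\ B' <_P B''\}$

$\mathit{before}(\langle P, \le_P \rangle, B) = \langle P', \le_{P'} \rangle$, where
$P' = \{B' \in P \mid B' <_P B\}$ and
$\le_{P'} = \{(B', B'') \in P' \times P' \mid B' \le_P B''\}$.
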

Intuitively, $\mathit{terminals}(\langle P, \le_P \rangle)$ is the set of all blocks at
terminal positions, and $\mathit{before}(\langle P, \le_P \rangle, B)$ is the 'ordered
sub-partition' of $\langle P, \le_P \rangle$, where all blocks are before $B$. For
example, consider the ordered partition $\langle P, \le_P \rangle$ of the blocks
$B_1, B_2, \ldots, B_9$, as illustrated in Fig. 1. The partial order $\le_P$ is defined
as the reflexive transitive closure of all the arrows in Fig. 1. Then,
$\mathit{terminals}(\langle P, \le_P \rangle)$ is the set $\{B_8, B_9\}$. Also,
$\mathit{before}(\langle P, \le_P \rangle, B_9)$ is the pair consisting of the set
$\{B_1, B_2, B_4, B_5, B_7\}$ and the partial order defined as the reflexive
transitive closure of the black arrows. Thus, $B_9$ is satisfied in the set of
assignments obtained by applying GLP to blocks before $B_9$. Accordingly, we can
view GLP as a process that successively solves each block in some order
respecting $\le_P$. This is always possible because $\le_P$ is a partial order.

Fig. 1. An ordered partition
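
The recursion of Definition 11 can be sketched directly. The following is a minimal illustration, not the authors' implementation: the set of all assignments is replaced by a small finite grid, the blocks `B1` and `B2` and their constraints are hypothetical, the order is given as the strict part of the partial order, and the satisfier is the least-squares one.

```python
from itertools import product
from typing import Callable, Dict, List, Set, Tuple

Assignment = Tuple[int, int]  # hypothetical variables (x, y)
Constraint = Tuple[Callable[[Assignment], float], int]  # (error function, level)

LEVELS = (0, 1)
BLOCKS: Dict[str, List[Constraint]] = {
    "B1": [(lambda t: abs(t[0] - 1), 0)],         # level 0: x = 1
    "B2": [(lambda t: abs(t[0] + t[1] - 3), 1)],  # level 1: x + y = 3
}
ORDER: Set[Tuple[str, str]] = {("B1", "B2")}      # strict part of the partial order
ALL = list(product(range(3), repeat=2))           # finite stand-in for all assignments


def satisfy(cands: List[Assignment], block: str) -> List[Assignment]:
    """Least-squares satisfier S(cands, block): lexicographically minimal scores."""
    if not cands:
        return []
    def key(t: Assignment) -> Tuple[float, ...]:
        return tuple(sum(e(t) ** 2 for e, lv in BLOCKS[block] if lv == level)
                     for level in LEVELS)
    best = min(key(t) for t in cands)
    return [t for t in cands if key(t) == best]


def terminals(blocks: List[str], order: Set[Tuple[str, str]]) -> List[str]:
    """Blocks with no later block."""
    return [b for b in blocks if not any((b, c) in order for c in blocks)]


def before(blocks: List[str], order: Set[Tuple[str, str]], b: str):
    """Ordered sub-partition of the blocks strictly before `b`."""
    sub = [c for c in blocks if (c, b) in order]
    return sub, {(x, y) for (x, y) in order if x in sub and y in sub}


def glp(blocks: List[str], order: Set[Tuple[str, str]]) -> List[Assignment]:
    """Intersect, over terminal blocks, the block solved within GLP of its predecessors."""
    result = list(ALL)  # also covers the empty partition
    for b in terminals(blocks, order):
        sub, sub_order = before(blocks, order, b)
        solved = satisfy(glp(sub, sub_order), b)
        result = [t for t in result if t in solved]
    return result


print(glp(["B1", "B2"], ORDER))  # [(1, 2)]
```

Here `B1` fixes x = 1 first, and `B2` is then solved only among the assignments that survived `B1`.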

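The next lemma shows that by using a global satisfier, GLP respects the
similarity of variable assignments for ordered partitions that satisfy the
conditions below:

Lemma 12. Let $S$ be global. Given an arbitrary $H$, $\langle P, \le_P \rangle$ of
$H$, and $\theta$ in $\pi_S(\langle P, \le_P \rangle)$, then any $\theta'$ is in
$\pi_S(\langle P, \le_P \rangle)$ if $\theta' \overset{H}{\sim} \theta$ and

$\forall B \in P.\ \forall c/l \in B.\ e_l(c,\theta) > 0 \Rightarrow \forall B' \in P.\ B' <_P B \Rightarrow \forall c'/l' \in B'.\ l' < l$.  (10)

Proof. By contradiction: Assume that there exists some $\theta'$ which is not
in $\pi_S(\langle P, \le_P \rangle)$. Then, it is necessary that for some $B_1$ in $P$,
$\theta'$ is in $\pi_S(\mathit{before}(\langle P, \le_P \rangle, B_1))$, but not in
$S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B_1)), B_1)$. Because
$\theta' \overset{H}{\sim} \theta$ holds and $S$ is global,
$\theta \overset{B_1}{\not\sim} \theta'$ does not hold. Therefore,
$\theta \overset{B_1}{<} \theta'$ must hold, that is, there exists some $l_1$ such that
$(\forall l \in \mathcal{L}.\ l < l_1 \Rightarrow \theta \overset{B_1/l}{\sim} \theta') \wedge \theta \overset{B_1/l_1}{<} \theta'$.
This implies that for some $B$ in $P$, $\theta \overset{B/l_1}{>} \theta'$ holds. Since
$\theta$ must be in $S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B)), B)$, it must
be either of the following two cases:

Case $\theta' \in \pi_S(\mathit{before}(\langle P, \le_P \rangle, B)) \wedge \theta' \notin S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B)), B)$.
Since $\theta \overset{B/l_1}{>} \theta'$ holds, there must exist some $l_2$ such that
$l_2 < l_1$ and $\theta \overset{B/l_2}{<} \theta'$.

Case $\theta' \notin \pi_S(\mathit{before}(\langle P, \le_P \rangle, B))$. Then, for
some $B'$ in $P$ such that $B' <_P B$, $\theta'$ is in
$\pi_S(\mathit{before}(\langle P, \le_P \rangle, B'))$, but not in
$S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B')), B')$. Since
$\theta \overset{B/l_1}{>} \theta'$ implies $\exists c/l_1 \in B.\ e_{l_1}(c,\theta) > 0$,
and also since (10) holds, $B'$ contains only stronger constraints than $l_1$.
Therefore, there exists some $l_2$ such that $l_2 < l_1$ and
$\theta \overset{B'/l_2}{<} \theta'$.

Beginning with $\theta \overset{B_1/l_1}{<} \theta'$, both of the two cases resulted in
that there exist some $l_2$ and $B_2$ such that $l_2 < l_1$ and
$\theta \overset{B_2/l_2}{<} \theta'$. Clearly, it causes an infinite sequence
$l_1, l_2, \ldots$ such that $l_k > l_{k+1}$. However, since each $l_k$ is a
non-negative integer, it is a contradiction. □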

Intuitively, Lemma 12 says that if GLP using a global satisfier generates a
variable assignment under which constraints with errors have only stronger
constraints before them, then it yields all similar (i.e. $\overset{H}{\sim}$)
assignments. Note that the sufficient condition (10) allows constraints without
errors to be placed after weaker ones.

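In the following theorem, we prove that such variable assignments are
solutions to the constraint hierarchy:

Theorem 13. Let $S$ be global. Given an arbitrary $H$, $\langle P, \le_P \rangle$ of
$H$, and $\theta$ in $\pi_S(\langle P, \le_P \rangle)$, then $\theta$ is a solution to
$H$ if (10) holds.

Proof. By induction on the size of $P$:

Induction base. If $|P| = 0$, the proposition holds.

Induction step. Assume that if $|P| < n$, the proposition holds. Now, let
$|P| = n$. For any $B$ in $\mathit{terminals}(\langle P, \le_P \rangle)$, $\theta$ must
be in $S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B)), B)$. Therefore, by the
induction hypothesis, $\theta$ is in $S(H_B)$, where $H_B$ is the union of blocks of
$\mathit{before}(\langle P, \le_P \rangle, B)$. Now, we assume (for contradiction) that
there exists some $\theta'$ such that $\theta' \overset{H_B \cup B}{<} \theta$, that is,
for some $l$,
$(\forall l' \in \mathcal{L}.\ l' < l \Rightarrow \theta' \overset{H_B \cup B/l'}{\sim} \theta) \wedge \theta' \overset{H_B \cup B/l}{<} \theta$.
It must be either of the following two cases:

Case $\exists l' \in \mathcal{L}.\ l' \le l \wedge (\forall l'' \in \mathcal{L}.\ l'' < l' \Rightarrow \theta' \overset{H_B/l''}{\sim} \theta \wedge \theta' \overset{B/l''}{\sim} \theta) \wedge \theta' \overset{H_B/l'}{<} \theta$.
Then, $\theta' \overset{H_B}{<} \theta$ holds. Therefore, $\theta \notin S(H_B)$, which
is a contradiction.

Case $\exists l' \in \mathcal{L}.\ l' \le l \wedge (\forall l'' \in \mathcal{L}.\ l'' < l' \Rightarrow \theta' \overset{H_B/l''}{\sim} \theta \wedge \theta' \overset{B/l''}{\sim} \theta) \wedge \theta' \overset{B/l'}{<} \theta$.
Then, for some $c/l'$ in $B$, $e_{l'}(c,\theta) > 0$ must hold. By (10), $H_B$ contains
only stronger constraints than $l'$. Therefore, $\theta' \overset{H_B}{\sim} \theta$
holds. By Lemma 12, $\theta'$ is also in
$\pi_S(\mathit{before}(\langle P, \le_P \rangle, B))$. However, since
$\theta' \overset{B}{<} \theta$ holds, it implies
$\theta \notin S(\pi_S(\mathit{before}(\langle P, \le_P \rangle, B)), B)$, which is a
contradiction.

Both cases caused contradiction. Therefore, there never exists such $\theta'$, i.e.
$\theta$ is in $S(H_B \cup B)$. Since $S$ is global, $\theta$ is also in $S(H)$ by
Theorem 10. □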

The theorem presents a strategy to design algorithms for solving constraint
hierarchies. As noted, the sufficient condition permits constraints without errors
to be located after weaker ones. In other words, we can delay the satisfaction of a
strong constraint with no error until some appropriate time, for example, "when
the constraint becomes uniquely satisfiable." Actually, Theorem 13 gracefully
explains why the DETAIL algorithm obtains solutions by using local propagation,
which we will describe in Sect. 4.

An important instance of such GLP is the refining method. Since constraints
have no weaker constraints before them in the method, it can be easily
understood by Theorem 13 that it generates only correct solutions, i.e. is sound.
In addition, using a certain kind of global satisfiers, the refining method yields
all solutions, i.e. is complete:

Proposition 14. Let $S$ be global such that for any $H$, $\theta$, and $\theta'$,
$\neg\,\theta \overset{H}{\not\sim} \theta'$. For any $H$ and
$\langle P, \le_P \rangle$ of $H$, $\pi_S(\langle P, \le_P \rangle) = S(H)$ if
$P = \{B_l \mid l \in \mathcal{L}\}$ and $B_l \le_P B_{l'} \Leftrightarrow l \le l'$,
where $B_l$ is the level $l$ of $H$.

By this proposition, a refining-method algorithm using a global hierarchy and
globally-better comparator, e.g. least-squares-better, is sound and complete,
because any globally-better comparator satisfies
$\neg\,\theta \overset{H}{\not\sim} \theta'$.^7
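
The refining method, seen as GLP with one block per level ordered from strongest to weakest, can be sketched as a successive filter. This is only an illustration under assumptions: the candidate set is a finite range of integer values of a hypothetical variable x, each level is compared by least squares, and the three constraints are invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

Constraint = Tuple[Callable[[int], float], int]  # (error function of x, level)


def refine(cands: Sequence[int], h: Sequence[Constraint],
           levels: Sequence[int]) -> List[int]:
    """Filter candidates level by level, strongest level first."""
    remaining = list(cands)
    for level in sorted(levels):
        if not remaining:
            return []
        def score(x: int) -> float:
            return sum(err(x) ** 2 for err, lv in h if lv == level)
        best = min(score(x) for x in remaining)
        # Keep only the least-squares optima of this level among the survivors.
        remaining = [x for x in remaining if score(x) == best]
    return remaining


H = [
    (lambda x: max(0.0, 2.0 - x), 0),  # level 0: x >= 2
    (lambda x: abs(x - 0.0), 1),       # level 1: x = 0
    (lambda x: abs(x - 3.0), 1),       # level 1: x = 3
]
print(refine(range(5), H, [0, 1]))  # [2]
```

Level 0 leaves x in {2, 3, 4}; at level 1 the summed squared errors are 5, 9, and 17 respectively, so x = 2 remains.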

Next, we define local level comparators, local constraint-hierarchy comparators,
and local constraint-hierarchy satisfiers:

Definition 15 (local level comparator). A level comparator
$\overset{\cdot/l}{\lesssim}$ is local iff for any $H$, $\theta$, and $\theta'$,

$\theta \overset{H/l}{\lesssim} \theta' \Rightarrow \forall c/l \in H.\ e_l(c,\theta) \le e_l(c,\theta')$.  (11)

Definition 16 (local constraint-hierarchy comparator). A constraint-hierarchy
comparator is local iff each level comparator is local.

Definition 17 (local constraint-hierarchy satisfier). A constraint-hierarchy
satisfier is local iff its constraint-hierarchy comparator is local.

By (4) and (11), a local level comparator results in
$\theta \overset{H/l}{\lesssim} \theta' \Leftrightarrow \forall c/l \in H.\ e_l(c,\theta) \le e_l(c,\theta')$,
which is equivalent to level comparators of locally-better comparators
in the original formalization. With additional restrictions on multi-way equality
constraints, we can regard our formulation as a theoretical basis of efficient
constraint-hierarchy satisfaction algorithms such as DeltaBlue.
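
A short sketch may help contrast the local level comparator with the least-squares one. Under the local comparator two assignments can be incomparable, whereas a comparator that combines errors into a single real cannot produce incomparability. The representation (assignments as pairs, constraints as error-function and level pairs) and the two constraints are assumptions of this illustration.

```python
from typing import Callable, List, Tuple

Assignment = Tuple[float, float]  # hypothetical variables (x, y)
Constraint = Tuple[Callable[[Assignment], float], int]  # (error function, level)


def local_leq(h: List[Constraint], level: int, a: Assignment, b: Assignment) -> bool:
    """Local comparator: every constraint at the level has no larger error under a."""
    return all(err(a) <= err(b) for err, lv in h if lv == level)


def ls_leq(h: List[Constraint], level: int, a: Assignment, b: Assignment) -> bool:
    """Least-squares comparator: compare summed squared errors at the level."""
    def total(t: Assignment) -> float:
        return sum(err(t) ** 2 for err, lv in h if lv == level)
    return total(a) <= total(b)


def incomparable(leq, h: List[Constraint], level: int,
                 a: Assignment, b: Assignment) -> bool:
    return not leq(h, level, a, b) and not leq(h, level, b, a)


# Two constraints at level 1: x = 0 and y = 0.
h = [(lambda t: abs(t[0]), 1), (lambda t: abs(t[1]), 1)]
a, b = (0.0, 1.0), (1.0, 0.0)

print(incomparable(local_leq, h, 1, a, b))  # True: each assignment wins on one constraint
print(incomparable(ls_leq, h, 1, a, b))     # False: both have squared-error sum 1
```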

The following proposition indicates a critical difference between global
hierarchy and globally-better comparators:

Proposition 18. Any local constraint-hierarchy comparator is global.

The original formulation presented locally-better and globally-better as separate
concepts. However, we successfully integrated locally-better and an important
class of globally-better into global hierarchy comparators via GSM.

Using a local satisfier, we can obtain a theorem with a weaker sufficient
condition than that of Theorem 13:

Theorem 19. Let $S$ be local. Given an arbitrary $H$, $\langle P, \le_P \rangle$ of
$H$, and $\theta$ in $\pi_S(\langle P, \le_P \rangle)$, then $\theta$ is a solution to
$H$ if

$\forall B \in P.\ \forall c/l \in B.\ e_l(c,\theta) > 0 \Rightarrow \forall B' \in P.\ B' <_P B \Rightarrow \forall c'/l' \in B'.\ l' \le l$.  (12)

The difference of (12) from (10) is the existence of equality in $l' \le l$, which
indicates that (12) is weaker than (10). Since it will provide more freedom to
organize ordered partitions, we can expect to develop more efficient constraint
solving algorithms using local satisfiers. For example, we can regard the blocked
constraint lemma presented in the DeltaBlue paper [4] as a specialization of
Theorem 19.
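
Conditions (10) and (12) differ only in whether an earlier block may contain a constraint of the same strength as a later constraint with a positive error. A minimal checker, assuming blocks are given as a dictionary of hypothetical constraints and the order as a set of strictly-before pairs, could look like this; the `strict` flag selects (10) when true and (12) when false.

```python
from typing import Callable, Dict, List, Set, Tuple

Assignment = Dict[str, float]
Constraint = Tuple[Callable[[Assignment], float], int]  # (error function, level)


def condition_holds(blocks: Dict[str, List[Constraint]],
                    order: Set[Tuple[str, str]],
                    theta: Assignment, strict: bool = True) -> bool:
    """Check (10) if strict, else (12), for assignment `theta`."""
    for name, constraints in blocks.items():
        for err, level in constraints:
            if err(theta) <= 0:
                continue  # constraints without errors are unrestricted
            for earlier, earlier_constraints in blocks.items():
                if (earlier, name) not in order:
                    continue  # only blocks strictly before `name` matter
                for _, earlier_level in earlier_constraints:
                    ok = earlier_level < level if strict else earlier_level <= level
                    if not ok:
                        return False
    return True


blocks = {
    "B1": [(lambda t: abs(t["x"] - 0.0), 2)],  # level 2: x = 0
    "B2": [(lambda t: abs(t["x"] - 1.0), 2)],  # level 2: x = 1
}
order = {("B1", "B2")}
theta = {"x": 0.0}

print(condition_holds(blocks, order, theta, strict=True))   # False: (10) is violated
print(condition_holds(blocks, order, theta, strict=False))  # True: (12) still holds
```

In this example x = 0 leaves the second constraint with a positive error while an equally strong constraint precedes it, which (12) tolerates and (10) does not.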


^7 It is probably possible to weaken the sufficient conditions for level comparators
since the condition for ordered partitions is too strong in the refining method.

GS: global satisfier
LS: local satisfier
GB: globally-better
LB: locally-better
RB: regionally-better
LSB: least-squares-better
WSB: weighted-sum-better
WCB: worst-case-better
M: monotonic

Fig. 2. Relationship of nonmonotonic constraint systems


3.4  Discussion 

In  this  subsection,  we  review  the  relationship  among  nonmonotonic  constraint 
systems,  which  is  roughly  illustrated  in  Fig.  2. 

Partial  constraint  satisfaction  [5]  is  a  considerably  general  theory.  Therefore, 
it  will  include  various  nonmonotonic  systems,  which  are  not  necessarily  efficiently 
solvable. 

Our reformulation of constraint hierarchies has become narrower than the
original one,^8 because we necessitated $\overset{\cdot/l}{\sim}$ to be transitive by
(5). For example, we exclude regionally-better in [11] since its level comparator
is defined as
$\theta \overset{H/l}{<} \theta' \Leftrightarrow \forall c/l \in H.\ e_l(c,\theta) \le e_l(c,\theta') \wedge \exists c/l \in H.\ e_l(c,\theta) < e_l(c,\theta')$
and
$\theta \overset{H/l}{\sim} \theta' \Leftrightarrow \neg\,\theta \overset{H/l}{<} \theta' \wedge \neg\,\theta \overset{H/l}{>} \theta'$,
where $\overset{H/l}{\sim}$ is not transitive. However, excluding such level
comparators contributed to theoretical cleanness and development of generalized
local propagation.

It is important to find an expressive and efficiently solvable class of
nonmonotonic constraint systems. Except for regionally-better and
worst-case-better, all the hierarchy comparators presented in the original
formulation are global by our formulation. We believe that this fact supports the
expressiveness of our global satisfiers with respect to constraint hierarchies.
Also, we claim that Theorem 13 for global satisfiers and Theorem 19 for local
satisfiers are useful in designing efficient constraint satisfaction algorithms.

4  The  DETAIL  Algorithm 

To show how to employ the results in the last section, we relate them to
the DETAIL algorithm, which we proposed in [6]. DETAIL is an incremental
algorithm for solving constraint hierarchies based on local propagation. It always
stores planning data instead of an appropriate ordered partition of the current
hierarchy, and modifies the plan if a constraint is added to or removed from the
hierarchy.

^8 Strictly speaking, as noted earlier, our theory allows conflicting constraints at
the top level, while the original theory restricts top-level constraints to be
required.

Fig. 3. A configuration of constraint cells

DETAIL handles multi-way equality constraints extended so that it can
simultaneously satisfy or properly relax them, in addition to solving them
individually as with classical local propagation. To process such constraints,
DETAIL maintains a set of constraint cells instead of an ordered partition into
blocks. A constraint cell can be regarded as a block including output variables,
where the constraints in the block are uniquely solved for the output variables.
Also, it never shares variables with any other cells. For example, to solve the
constraint strong x + y = 3 for variable x, DETAIL yields a cell of
strong x + y = 3 and x. By contrast, to simultaneously solve strong x + y = 3 and
weak x - y = 1, it generates a cell of the two constraints and the variables x and
y. Similarly, to relax strong x = 0 and strong x = 1, it produces a cell consisting
of the two constraints and x.^9 DETAIL solves such constraint cells with pluggable
numerical modules called subsolvers using e.g. Gaussian elimination.

By the definition of constraint cells, we can determine dependency among
cells. Additionally, if we prohibit cyclic dependency, we can naturally identify
the overall dependency among cells with a partial order among blocks. Then,
we can perform GLP in a 'unique' manner as with conventional local propagation.
For example, consider the hierarchy with the constraints [a], [b], ..., [h]
in Fig. 3, where the squares and circles represent constraints and variables
respectively, and the boxes with round corners indicate cells. Clearly, in the order
respecting  the  cell  dependencies,  such  as  A,  jB,  iJ,  G,  and  we  can  uniquely 
solve  constraints  in  each  cell. 

The other issue is how to determine configurations of cells that obtain correct
solutions. To guarantee the sufficient condition (10) for Theorem 13, we
employed walkabout strengths, which had been first introduced in DeltaBlue [4].
In DETAIL, walkabout strengths, associated with constraint cells, are defined
to propagate strengths of the weakest constraints. For example, in Fig. 3, the
walkabout strength medium of cell G is inherited from the weakest constraint [h]
among all the constraints that the variables in G depend on, i.e. [f], [g], and
[h]. Therefore, the walkabout strength of a cell indicates that there are only
constraints with equal or stronger strengths before/in the cell. Thus, it can be
easily verified whether a configuration of cells satisfies the sufficient condition
for Theorem 13. For example, in Fig. 3, although the weak constraints [c] and [e]
in E have positive errors, the walkabout strengths required and medium of the
preceding cells A and G indicate that all the forward constraints are stronger
than weak.

^9 SkyBlue also realizes simultaneous satisfaction by calling 'cycle solvers,' but
provides no features for relaxing constraints [9].

Fig. 4. Adding a constraint to a hierarchy of equalities
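
The use of walkabout strengths to verify condition (10) can be sketched as follows. This is a simplified illustration, not the DETAIL data structure of [6]: it assumes that the walkabout strength of a cell is the weakest strength among the constraints in the cell and in all cells it depends on, as described above, and the cells `P`, `Q`, `R`, `T` with their strengths and dependencies are hypothetical.

```python
from typing import Dict, List

# Strength names mapped to levels; larger means weaker.
LEVEL = {"required": 0, "strong": 1, "medium": 2, "weak": 3}

# Hypothetical cells: strengths of the constraints each cell contains.
CELLS: Dict[str, List[str]] = {
    "P": ["required"],
    "Q": ["strong", "medium"],
    "R": ["weak", "weak"],
    "T": ["weak"],
}
# Hypothetical acyclic dependencies: cell -> cells it directly depends on.
DEPENDS_ON: Dict[str, List[str]] = {"P": [], "Q": ["P"], "R": ["P", "Q"], "T": ["R"]}


def walkabout(cell: str) -> int:
    """Weakest level among constraints in `cell` and in every cell before it."""
    own = max(LEVEL[s] for s in CELLS[cell])
    return max([own] + [walkabout(p) for p in DEPENDS_ON[cell]])


def preceding_cells_stronger(cell: str, level: int) -> bool:
    """True if all constraints before `cell` are strictly stronger than `level`.

    Checking direct predecessors suffices because their walkabout strengths
    already account for everything before them.
    """
    return all(walkabout(p) < level for p in DEPENDS_ON[cell])


print(walkabout("Q"))                                   # 2, i.e. medium
print(preceding_cells_stronger("R", LEVEL["weak"]))     # True
print(preceding_cells_stronger("T", LEVEL["weak"]))     # False: R is already weak
```

A cell such as `T`, whose weak constraints would follow another cell of walkabout strength weak, is the kind of situation in which condition (10) cannot be guaranteed without merging cells.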

Now, we demonstrate the DETAIL algorithm by example. Figure 4a illustrates
the initial configuration of cells, and suppose that we add a new constraint
[h] medium z = 7 to it. The current solution z = 3 conflicts with [h], and the
walkabout strength weak of G shows that there are one or more weak constraints
in or before G. Therefore, we must change the configuration in the following
steps:

1. First, move along the path from the new cell to the nearest source of the
walkabout strength, i.e. from H to E, reversing the dependency between
them, as shown in Fig. 4b. Note the multi-way equality property of
constraints always enables us to perform the reversing operation [6].

2. Next, merge cyclic dependencies generated from the previous step if any. In
the example, we collapse the cycle of G' and F as illustrated in Fig. 4c.

3. Third, check whether the victimized cell E' has any preceding cells with
the same walkabout strength weak. Figure 4c shows that D is such a cell.
Since it violates the sufficient condition for generating solutions, merge all
the transitively adjacent cells with the same walkabout strength, i.e. E', D,
and C (but not B). Then, we obtain the final configuration in Fig. 4d.

In step 3, we merged all the transitively adjacent cells with the same
walkabout strength to ensure the sufficient condition (10) for global satisfiers.
However, if we use a local satisfier, we only need to guarantee the weaker
condition (12), and therefore, we can omit step 3 (the final configuration would
have been Fig. 4c). DETAIL also provides the support for local satisfiers, which
usually results in smaller constraint cells that can be solved more efficiently.


5 Conclusions and Status

We reformulated the definition of constraint hierarchies, and proposed
generalized local propagation to theoretically study local propagation therein. We
showed that globally semi-monotonic satisfaction of hierarchies exhibits a
practically useful property for generalized local propagation.

By applying the results, we are extending the DETAIL algorithm to handle
‘multi-way inequality constraints’. We have already established its basis, and have
implemented a prototype constraint solver. Due to the existence of inequalities,
the new algorithm is exponential in time complexity, unlike the original DETAIL,
which is polynomial. Therefore, we are mainly exploring performance techniques
such as efficient scheduling and pruning of constraints.
Abstract. We present a general methodology for transforming between
HCLP and PCSP in both directions. HCLP and PCSP each has advantages
when modelling problems, and each has advantages when implementing
models and solving them. Using the work presented in this
paper, the appropriate paradigm can be used for each of these steps,
with a meaning-preserving transformation in between if necessary.


1  Introduction 

The  Hierarchical  Constraint  Logic  Programming  (HCLP)  scheme  of  Borning, 
Wilson,  and  others  [3,  10,  12]  greatly  extends  the  expressibility  of  the  general 
CLP  scheme  [7].  A  semantics  has  been  defined  for  HCLP  [10,  11]  and  some 
instances  of  it  have  been  implemented  [8,  10]. 

The  Partial  Constraint  Satisfaction  (PCSP)  scheme  of  Freuder  and  Wallace 
[4,  6]  is  an  interesting  extension  of  CSP,  which  allows  the  relaxation  and  opti¬ 
misation  of  problems.  Extensive  empirical  studies  have  been  made  of  some  of  its 
instances  [6]. 

There  is  a  widespread  view  that  some  link  exists  between  particular  HCLP 
problems  and  particular  PCSP  problems,  but  no  general  method  of  transforming 
one  into  the  other  is  present  in  the  literature.  General  frameworks  have  been 
developed  of  which  HCLP  and  PCSP  are  particular  instances  [1,  9],  but  they  do 
not  provide  a  method  for  transforming  between  them.  In  this  paper  we  present 
a  completely  general  method  for  finding  the  HCLP  equivalent  of  any  PCSP 
problem,  and  vice  versa. 

Our motivation is mainly methodological, to allow the use of whichever
paradigm is appropriate for specification, even if the other one is more appropriate
for execution. But we also have a more theoretical motivation, namely
to show the relationship between the two formalisms’ orthogonal approaches to
over-constrained systems (OCSs). The two orthogonal approaches are as follows:
HCLP reorganises the structure of an over-constrained problem, by specifying
relationships between constraints; PCSP keeps a flat structure to the problem,
but changes the meaning of the individual constraints (by adding elements to
the domain).

funded by the European Community under the TMR scheme.
as a Research Associate. Part of this work was carried out in the context of the INTAS

The structure of this paper is as follows. In Sect. 2 we introduce HCLP and
PCSP. We then make some preliminary remarks in Sect. 3. We discuss transforming
HCLP into PCSP in Sect. 4, and the transformation from PCSP to HCLP
in Sect. 5. Both these sections contain illustrative examples and pseudo-code.
Finally, in Sect. 6 we present some conclusions and also mention further work.

2  Background 

2.1  Hierarchical  Constraint  Logic  Programming

A good introduction to HCLP can be found in Molly Wilson’s PhD thesis [10,
chapter 4] or in the early reference [3]; here is a brief overview. CLP can be
extended to a Hierarchical CLP scheme including both ‘hard’ and ‘soft’ constraints.
The HCLP scheme is parameterised not only by the constraint domain
$\mathcal{D}$ but also by the ‘comparator’ $C$, which is used to compare and select from the
different ways of satisfying the soft constraints.

The strengths of the different constraints are indicated by a non-negative
integer label. Constraints labelled with a zero are required (hard), while constraints
labelled $j$ for some $j > 0$ are optional (soft), and are preferred over
those labelled $k$, where $k > j$. (A program can include a list of symbolic names,
such as required, strongly-preferred, etc., for the strength labels, which will be
mapped to the natural numbers by the interpreter. If the strength label on a
constraint is omitted, it is assumed to be required.)

The constraint store $\sigma$ (a set) is partitioned into the set of required constraints
$S_0$ and the sets of optional ones $S_i$ ($i > 0$). The solution set for the whole hierarchy is
a subset of the solution set of $S_0$, such that no other solution could be ‘better’,
i.e. for all levels up to $k$, $S_k$ is completely satisfied, and for level $S_{k+1}$ this
solution is better than all others, in terms of some comparator. Backtracking
and incomparable hierarchies give rise to multiple possible solution sets, each a
subset of the solution to $S_0$.

‘Better’  is  defined  with  respect  to  some  comparator  [12].  The  key  notion 
is  that  a  comparator  is  a  function  from  a  solution  of  a  set  of  constraints  to  a 
sequence  of  numbers,  which  are  then  ordered  lexicographically;  the  first  element 
of  the  sequence  measures  how  well  the  solution  satisfies  the  required  constraints, 
the  second  how  well  the  strongest  optional  constraints  are  satisfied,  etc.;  the 
earlier  in  the  order,  the  better  that  solution  is. 
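
The lexicographic ordering of error sequences can be illustrated with a minimal Python sketch. The candidate names and their error sequences below are hypothetical values chosen for illustration; they are not taken from any HCLP implementation. The sketch relies only on the fact that Python tuples compare lexicographically.

```python
# Hypothetical error sequences (required, strong, weak) for three
# candidate solutions; lower is better at every position.
scores = {
    "a": (0, 1, 0),
    "b": (0, 0, 3),
    "c": (0, 0, 1),
}

def best_solutions(scores):
    """Return the candidates whose error sequence is lexicographically least."""
    least = min(scores.values())  # tuples are compared position by position
    return sorted(name for name, seq in scores.items() if seq == least)

print(best_solutions(scores))
```

Candidate "a" loses at the strong level regardless of how well it does at the weak level, which is exactly the stratification that strength labels are meant to express.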

See Sect. 3.3 for more detail on the particular aspects of HCLP involved in
the transformations.

2.2  Partial  Constraint  Satisfaction  Problems 

Freuder has developed a theory of Partial Constraint Satisfaction Problems (PCSPs)
to weaken systems of constraints which have no solutions, or for which
finding a solution would take too long [4, 6]. A PCSP is formalised as containing
three components

$$((P, U),\ (PS, \le),\ (M, (N, S)))$$

where $P$ is a constraint satisfaction problem (CSP), $U$ is a set of ‘universes’ i.e. a
set of potential values for each of the variables in $P$, $(PS, \le)$ is a problem space
with $PS$ a set of problems and $\le$ a partial order over problems, $M$ is a distance
function over the problem space, and $(N, S)$ are necessary and sufficient bounds
on the distance between the given problem $P$ and some solvable member of the
problem space $PS$.

A solution to a PCSP is a problem $P'$ from the problem space and its solution,
where the distance between $P$ and $P'$ is less than $N$. If the distance between $P$
and $P'$ is minimal, then this solution is optimal.


Constraint Satisfaction Problems. The definition of a constraint satisfaction
problem is well known: it consists of a pair $(V, C)$ where $V$ is a set of variables,
each with a domain (extension), and $C$ is a set of constraints¹. Solving a CSP
involves finding one value from the domain of each variable such that all the
constraints are satisfied simultaneously. Generally the CSP world restricts itself
to considering binary constraints over variables with finite domains. A constraint
$c$ between two variables $x$ and $y$ can be denoted $c_{xy}$.

(The domains of the variables in $V$ are usually considered as unary constraints,
but in order to simplify the presentation in [4] they are represented as
binary constraints between a variable and itself. The value $v$ is in the domain
of a variable $x$ if $c_{xx}$ contains $(v, v)$. In fact, unless there are elements in the
domain of a variable which do not appear in any constraint, it is redundant to
state individual variable domains explicitly: we can always reconstruct them by
saying that the domain of a variable $u$ is $\{\, i \mid (i,j) \in c_{uv} \text{ or } (k,i) \in c_{wu} \text{ for all } j, k, v, w \,\}$.)
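
The reconstruction of domains from extensional binary constraints can be sketched as follows. This is an illustrative sketch only: the dictionary representation (a pair of variable names mapped to a set of value pairs) is our assumption, and the sample data is borrowed from the clothes example of Sect. 4.3.

```python
def reconstruct_domains(constraints):
    """Collect, for each variable, every value that occurs in some constraint.

    `constraints` maps a pair of variable names (u, v) to the set of
    value pairs (i, j) allowed by the constraint c_uv (assumed layout).
    """
    domains = {}
    for (u, v), pairs in constraints.items():
        for i, j in pairs:
            domains.setdefault(u, set()).add(i)  # i is a value of u
            domains.setdefault(v, set()).add(j)  # j is a value of v
    return domains

constraints = {
    ("S", "T"): {("r", "g"), ("w", "b"), ("w", "d")},
    ("F", "T"): {("s", "d"), ("c", "g")},
    ("S", "F"): {("w", "c")},
}
domains = reconstruct_domains(constraints)
```

A value that appears in no constraint would be missed by this reconstruction, which is the exception noted in the text.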


The Problem Space. A problem space $PS$ is a partially-ordered set of CSPs
where the order $\le$ is defined as follows ($\mathrm{sols}(P)$ denotes the set of solutions to a
CSP called $P$):

$$P_1 \le P_2 \iff \mathrm{sols}(P_1) \supseteq \mathrm{sols}(P_2)$$

Note that the ordering is over problems, but defined in terms of solutions. The
problem space for a PCSP must contain the original problem $P$, which can
provide the maximal element in the order, for standard problem spaces. (In the
most general case, $PS$ can in fact contain $Q$ such that $P \le Q$ or such that $P$
and $Q$ are incomparable. But if we take the conjunction of all the constraints
in all the problems in $PS$ and create a single problem $R$, then $R$ will definitely
be the greatest element in the order.) If $P$ has no solutions, then $\mathrm{sols}(P) = \{\}$,
which is a subset of all other sets.
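
As a small illustration of this ordering, the following Python sketch enumerates solutions by brute force and compares two problems. The function names, the dictionary representation of constraints, and the two toy problems are all assumptions made for this example.

```python
from itertools import product

def sols(domains, constraints):
    """Brute-force solution set of a binary CSP, as tuples in sorted variable order."""
    names = sorted(domains)
    found = set()
    for values in product(*(sorted(domains[n]) for n in names)):
        val = dict(zip(names, values))
        if all((val[x], val[y]) in pairs for (x, y), pairs in constraints.items()):
            found.add(values)
    return found

def leq(p1, p2, domains):
    """P1 <= P2 iff sols(P1) is a superset of sols(P2)."""
    return sols(domains, p1) >= sols(domains, p2)

domains = {"x": {1, 2}, "y": {1, 2}}
P = {("x", "y"): {(1, 1)}}            # the original problem
Q = {("x", "y"): {(1, 1), (2, 2)}}    # P weakened by one extra pair
```

Here `leq(Q, P, domains)` holds but `leq(P, Q, domains)` does not, so the weakened problem sits below the original in the order.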

¹ Constraints are relations over the variables in $V$. In CSPs, they are usually treated
extensionally, i.e. a binary constraint is just considered as a set of pairs.

The obvious problem space to explore when trying to weaken a problem is
the collection of all problems $Q$ such that $Q \le P$, but it may also be useful to
consider only some of these $Q$s, i.e. those problems which have been weakened
in a particular way which makes sense in the context of the system that we are
trying to model.


Weakening a Problem. There are four ways to weaken a CSP: (a) enlarging
the domain of a variable, (b) enlarging the domain of a constraint, (c) removing
a variable, and (d) removing a constraint. Consider example Z (Sect. 4.3): if none
of your shirts match your shoes, you could buy new shoes (variable domain
enlargement / augmentation), you could decide that certain shoes do, after all,
go with a certain shirt (constraint augmentation), you could decide not to wear
shoes at all (variable removal), or you could ignore clashes between shoes and
shirts (constraint removal). (As a comparison with these four methods, in HCLP
we could decide that the constraint that shirts match shoes is simply not very
important.)

Freuder shows in [4] that these can all be considered in terms of (b) above, i.e.
enlarging constraint domains (adding extra pairs to the relation which defines
the constraint). (a) As we have already decided to consider the domains of variables
as binary constraints $c_{xx}$, domain enlargement can clearly be achieved by
constraint augmentation. (d) Enlarging a constraint $c_{xy}$ until it equals $x \times y$ (the
cartesian product of the domains) has the same effect as removing it altogether.
(c) Removing all the constraints on a variable achieves the aim of removing the
variable itself.
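
Case (d) can be made concrete with a short Python sketch: a constraint enlarged to the full cartesian product of the two domains no longer rules anything out. The helper name and the set-of-pairs representation are assumptions for illustration; the data is the shirt/footwear constraint of example Z.

```python
from itertools import product

def remove_constraint_by_augmentation(domains, x, y):
    """Return c_xy enlarged to the cartesian product of the domains of x and y."""
    return set(product(domains[x], domains[y]))

domains = {"S": {"r", "w"}, "F": {"c", "s"}}
c_sf = {("w", "c")}                                   # original constraint
c_sf_relaxed = remove_constraint_by_augmentation(domains, "S", "F")

# Every combination of shirt and footwear is now allowed, so the
# relaxed constraint has the same effect as having no constraint at all.
all_allowed = all((s, f) in c_sf_relaxed
                  for s in domains["S"] for f in domains["F"])
```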


The Distance Function. Different distance functions are possible, but one
obvious one is derived from the partial order on the problem space. If $M(P, P')$
equals the number of solutions not shared by $P$ and $P'$, then when $P' \le P$
the distance function measures how many solutions have been added by the
relaxation of $P$. Another distance function is a count of the number of constraint
values not shared by $P$ and $P'$, and yet others could be based on HCLP-like
strength labels. Freuder suggests that a distance function may be used which
will tend to find weakened problems with certain properties, for example one
whose constraint graph has certain structural properties (see [5]).
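
The two distance functions just described can be sketched in Python as symmetric differences. The function names and the dictionary representation of problems are assumptions for this illustration; the sample data again uses the shirt/footwear constraint.

```python
def solution_distance(sols_p, sols_q):
    """Number of solutions not shared by the two problems."""
    return len(sols_p ^ sols_q)          # symmetric difference of solution sets

def constraint_value_distance(p, q):
    """Number of constraint values (allowed pairs) not shared by P and Q.

    Each problem maps a pair of variable names to a set of value pairs.
    """
    keys = set(p) | set(q)
    return sum(len(p.get(k, set()) ^ q.get(k, set())) for k in keys)

P = {("S", "F"): {("w", "c")}}
P_prime = {("S", "F"): {("w", "c"), ("r", "c")}}   # one augmentation
```

With these definitions `constraint_value_distance(P, P_prime)` counts the single added pair, and when the relaxed problem gains one solution that the original lacked, `solution_distance` is likewise 1.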


3  Transformation:  Preliminary  Remarks 

HCLP and PCSP are not identical in scope, therefore it is impossible to transform
all of HCLP into PCSP. However, the work presented in the rest of this
paper is complete in the sense that we present transformations for every single aspect
which can be transformed. First of all, however, we discuss those parts of HCLP
which are outside the scope of PCSP, and make other preliminary remarks.
3.1  Differences  Which  Will  Not  Be  Transformed  Away

Firstly², CLP in general defines a class of programming languages, which place
constraint solving in a logic programming framework, whereas CSP defines a
set of problems, techniques, and algorithms. We could embed PCSP in a logic
programming framework, and then a comparison with HCLP would make sense,
or we can ignore the programming language aspects of HCLP, and compare
the resulting theory of ‘constraint hierarchies’ with PCSP. In this section we will
consider the latter approach, i.e. when we say ‘HCLP’ we really mean ‘constraint
hierarchies’.

Secondly, CSP techniques are always defined with finite domains whereas the
CLP framework extends to continuous domains such as the real numbers. We
will only attempt to transform HCLP(FD); however, we will transform metric
comparators as well as predicate ones. (Metric comparators require a notion of
‘distance’ between points in the domain, but there is no reason why this distance
cannot be discrete.)

Finally,  in  HCLP  the  required  constraints  are  special;  the  difference  between 
required  and  strong  constraints  is  richer  than  the  difference  between,  say,  strong 
and  weak.  PCSP  does  not  have  this  special  class  of  required  constraints.  This  is 
discussed  further  in  the  next  section. 


3.2  PCSP  with  Distinguished  Required  Constraints 

In Sect. 2.2, we presented the standard formalisation of PCSPs as $((P, U), (PS, \le),
(M, (N, S)))$. We can modify this to allow us to denote a subset of the constraints
in $P$ as ‘required’, giving a theory which can be called P$_R$CSP (our additions are
the set $R$ and the clauses that refer to it):

$$((P, R, U),\ (PS, \le),\ (M, (N, S)))$$

where $P$ is a constraint satisfaction problem, $R \subseteq P$ is a set of constraints, $U$
is a set of ‘universes’ i.e. a set of potential values for each of the variables in $P$,
$(PS, \le)$ is a problem space with $PS$ a set of problems each of which contains
all the constraints in $R$, and $\le$ a partial order over problems, $M$ is a ‘distance
function’ on the problem space, and $(N, S)$ are necessary and sufficient bounds
on the distance between the given problem $P$ and some solvable member of the
problem space $PS$. A solution to a P$_R$CSP is a problem $P'$ from the problem
space and its solution, where the distance between $P$ and $P'$ is less than $N$, and
where all the constraints in $R$ are satisfied. If the distance between $P$ and $P'$ is
minimal, then this solution is optimal.

In Sect. 2.2 we noted that Freuder states that the obvious problem space to
explore when trying to weaken a problem is the collection of all problems $Q$
such that $Q \le P$, but we also noted that it may be useful to consider only some
of these $Q$s, i.e. those problems which have been weakened in a particular way
which makes sense in the context of the system that we are trying to model [5].
Therefore we note that P$_R$CSP can be considered simply as selecting those $Q$s
which satisfy all the constraints in $R$.

² The three points mentioned in Sect. 3.1 are reasonably straightforward, but have
not been explicitly made in any publication. They were mentioned to one of the
authors by Borning [Private Communication], but we were already aware of them
independently.

One way to select the appropriate part of the problem space is to choose a
distance function which gives an infinitely large distance for all other parts. If
distance functions are generally denoted by $\mu$, from now on we will assume the
existence of a particular function $\mu_\infty$, usually parameterised by a set of required
constraints $\sigma$, which defines a distance of zero to any problem which satisfies all
the constraints in $\sigma$, and a distance of infinity to all other problems. If $T$ is some
arbitrary problem drawn from the problem space, then

$$\mu_\infty(\sigma)(T) = \begin{cases} 0, & \text{if } \mathrm{sols}(T) \subseteq \mathrm{sols}(\sigma) \\ \infty, & \text{otherwise} \end{cases}$$

$\mu_\infty(\sigma_r)$ will be the first element of the sequence of functions $\mu = [\mu_r, \mu_s, \mu_w, \ldots]$
parameterised by the constraints at each level of the hierarchy. For example, if
the comparator used is UCB, then $\mu = [\mu_\infty(\sigma_r), \mu_{UCB}(\sigma_s), \mu_{UCB}(\sigma_w), \ldots]$.
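
A minimal Python sketch of the required-level distance function follows. It assumes that solution sets are available as Python sets; the function name `mu_inf` and the sample sets are ours, chosen for illustration.

```python
import math

def mu_inf(sols_sigma):
    """Build the distance function for a set of required constraints sigma.

    `sols_sigma` is the (assumed precomputed) solution set of sigma.
    The returned function maps sols(T) to 0 or infinity.
    """
    def distance(sols_t):
        # Zero iff every solution of T also satisfies sigma.
        return 0 if sols_t <= sols_sigma else math.inf
    return distance

required = mu_inf({(1, 1), (1, 2)})
good = {(1, 1)}                 # sols(T) contained in sols(sigma)
bad = {(1, 1), (2, 2)}          # (2, 2) violates a required constraint
```

Because the value is compared first in the lexicographic order, any problem with `required(bad) == math.inf` is ranked after every problem that respects the required constraints, whatever its scores at weaker levels.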

The main conclusion of this section is that we can deal with the issue of
required constraints in a straightforward and localised manner. Therefore, perhaps
surprisingly, in the rest of this paper we do not really need to emphasise
the difference between PCSP and P$_R$CSP.


3.3  Characterisation  of  HCLP  and  PCSP 

In this section we present those aspects which are relevant for the transformation
process. The relevant aspects for HCLP are

$$(H = [H_0, H_1, H_2, \ldots],\ C = (e, E, g))$$

where $H$ is a hierarchy of constraints, made up of all the required constraints $H_0$,
the strongly preferred constraints $H_1$, weaker preferences $H_2$, etc. The comparator
$C$ is used to compare different solutions; it is made up of an error function $e$
which calculates the error of a possible solution with respect to one constraint,
$E$ which simply maps $e$ pointwise over all the constraints in one level of the
hierarchy $H_i$, and a combining function $g$ which combines the elements of the
sequence produced by $E$, resulting in a score for that solution with respect to
all the constraints at that level of the hierarchy. For example $g$ might be ‘max’,
or ‘sum’, or ‘least squares’. The resulting sequence of errors $[r, s, w, \ldots]$, giving
the errors with respect to each level of the hierarchy, is used to order different
possible solutions lexicographically. The lowest element in the order indicates
the best solution.
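
The three parts of a comparator can be sketched in Python as follows. This is a hedged illustration, not the HCLP implementation: the representation of a constraint as a pair (variable names, allowed value pairs), the predicate error function, and the use of `sum` for $g$ are our assumptions, and the hierarchy reuses the clothes constraints of Sect. 4.3.

```python
def e_predicate(constraint, valuation):
    """Error of one constraint: 0 if the valuation satisfies it, else 1."""
    (x, y), pairs = constraint
    return 0 if (valuation[x], valuation[y]) in pairs else 1

def E(level, valuation, e=e_predicate):
    """Map the error function pointwise over one level of the hierarchy."""
    return [e(c, valuation) for c in level]

def score(hierarchy, valuation, g=sum):
    """Combine each level's errors with g, giving one number per level."""
    return tuple(g(E(level, valuation)) for level in hierarchy)

C_ST = (("S", "T"), {("r", "g"), ("w", "b"), ("w", "d")})
C_FT = (("F", "T"), {("s", "d"), ("c", "g")})
C_SF = (("S", "F"), {("w", "c")})
hierarchy = [[], [C_ST], [C_FT, C_SF]]   # H_0 empty, H_1 strong, H_2 weak

first = score(hierarchy, {"S": "r", "F": "c", "T": "g"})
second = score(hierarchy, {"S": "r", "F": "s", "T": "d"})
```

Tuples compare lexicographically in Python, so `first < second` expresses that the first valuation is the better one.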

PCSP is formalised as a triple $((P, U), (PS, \le), (M, (N, S)))$, but we need
only consider certain elements of it as follows: $P$ is a constraint satisfaction
problem, and $M$ is a distance function which selects the consistent problem
‘nearest’ to $P$.
When transforming HCLP into PCSP, we will take all the constraints in $H$
without their strength labels as being $P$. We will use the strength label information
and the comparator to construct the appropriate distance function.

When transforming PCSP into HCLP, the constraints in the hierarchy will
just be the constraints in $P$, and the distance function will be used to define
their strength labels (i.e. which of the $H_i$ should contain each constraint) and
the comparator $C$.

In the case of the standard PCSP distance function, all the constraints from
$P$ must be placed in the same non-required level of the hierarchy, but it does
not matter which one is used. Arbitrarily, we choose to label them ‘strong’ and
so put them in $H_1$.

4  Transforming  HCLP  into  PCSP 

4.1  Creating  the  Distance  Function 

The base problem ($P$) is all the constraints in the hierarchy, without their
strength labels. $U$, $PS$, and $(N, S)$ remain as they would for an original PCSP
based on $P$. (By ‘original PCSP’ we mean one written down by a user, as opposed
to one created by automatically transforming an HCLP problem.)

The distance function will be calculated from a combination of the HCLP
comparator and the particular hierarchy of labelled constraints, and the hierarchy
will lead to it being stratified into a lexicographic order. The distance
function derived from a hierarchy with $n$ levels will be stratified into $n$ parts,
whose results will be ordered lexicographically (i.e. it will not calculate a single
distance of the relaxed CSP from $P$). Each relaxation (each problem drawn from
the problem space $PS$) will be annotated with a sequence $[d_0, d_1, d_2, \ldots, d_{n-1}]$,
each element of which is calculated by the respective distance function in $\mu =
[\mu_0, \ldots, \mu_{n-1}]$. (The required level is formally called level 0, the strongest
non-required level is 1, down to $n - 1$ for the weakest level.) For example, in the
case of a hierarchy containing only required, strong and weak constraints, each
candidate problem will be annotated with a sequence $[r, s, w]$, where $r$ is the distance
according to $\mu_r$, the part of the distance function derived from the required
constraints, $s$ is the distance according to $\mu_s$, the part of the distance function
derived from the strong constraints, and $w$ is the weak distance, calculated by
$\mu_w$. We then order the various relaxations according to the lexicographical order
of their sequences.

The distance function calculates the distance of one of the problems $T$ in the
problem space $PS$ from the ‘ideal’ set of constraints which would have distance
zero (i.e. completely satisfy all the constraints in the original problem). In fact,
as the original constraints might be inconsistent, it is possible that no such ideal
set exists.

Let us define $\mathrm{sols}(T)$ to be the set of solutions to $T$. We require $T$ to be
consistent, and so $\mathrm{sols}(T)$ will never be empty. Each member of $\mathrm{sols}(T)$ is a
valuation, i.e. an assignment of a value from its domain to each variable in $T$.
We can calculate how well a particular valuation satisfies the constraints $\sigma$ using
the machinery developed by Borning and Wilson for HCLP.

$T$ may have more than one solution, and hence may give rise to more than
one valuation, therefore we define the distance of $T$ from $\sigma$ to be the maximum
of the distances of each of the valuations of $T$. This is necessary because HCLP’s
comparators take as input the set of original constraints and a single valuation
(possible solution). The output is the score for that particular valuation, which
can then be used to place that valuation in an order. In PCSP, however, distance
functions create an order over sets of constraints; a set of constraints can have
many solutions, and so we have to choose the score of one of them. We choose
the worst (largest) score $x$, i.e. this set of constraints can never give an answer
with a score worse than $x$. For example, if $T$ is said to be a distance of 2 from
$\sigma$, that means that any solution of $T$ is a distance of at most 2 from $\sigma$.

Therefore, using some HCLP terminology including denoting a general comparator
by $C$ (defined in terms of $g$, $e$, and $E$), the PCSP distance function defined
in terms of the set $\sigma$ of constraints from one optional level of the hierarchy is:

$$\mu_C(\sigma)(T) = \max\{\, g(E(\sigma\tau)) \mid \tau \in \mathrm{sols}(T) \,\}$$

In other words, we treat all the constraints in $\sigma$ as a sequence, apply a particular
valuation $\tau$ to each of them, calculate the error for each member of the sequence,
combine the errors using $g$, and then take the maximum of the errors for all the
$\tau$ and treat it as the error for $T$.

The  various  distance  functions,  each  parameterised  by  the  constraints  from 
a  different  level  of  the  hierarchy,  will  lead  to  results  which  are  lexicographically 
ordered,  just  as  in  HCLP.  The  main  difference  between  standard  HCLP  and  our 
work  is  that  we  interpose  the  step  of  taking  the  maximum  error  for  each  of  the 
valuations  in  T  between  the  application  of  g  and  placing  in  an  order. 

In the case of UCB, $g(\mathbf{v}) = \sum_{i=1}^{|\mathbf{v}|} v_i$ and $e = e_p$ is the simple predicate error
function which returns 0 for each constraint in $\sigma$ which is consistent with $\tau$, and
1 for each inconsistent constraint [2, 12]. $E$ is $e$ raised over sequences, i.e. its
input is a sequence of constraints, and its output in this case is a sequence of 0s
and 1s. $g$ then adds all these individual errors. Least-squares-better (LSB) has
a more complicated $e = e_d$, which measures the error as a ‘distance’ in a metric
space. $g$ then sums the squares of these errors:

$$\mu_{UCB}(\sigma)(T) = \max\Big\{ \sum_{c \in \sigma} e_p(c, \tau) \;\Big|\; \tau \in \mathrm{sols}(T) \Big\}$$

$$\mu_{LSB}(\sigma)(T) = \max\Big\{ \sum_{c \in \sigma} e_d(c, \tau)^2 \;\Big|\; \tau \in \mathrm{sols}(T) \Big\}$$
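
The two distance functions can be sketched in Python with one generic helper that takes the worst score over the solutions of $T$. The constraint representations, the metric error for constraints of the hypothetical form ‘$x = k$’, and the sample data are all assumptions made for this illustration.

```python
def mu(sigma, sols_t, e, g):
    """Distance of T from level sigma: the worst score over T's solutions.

    `sols_t` must be non-empty, matching the requirement that T is consistent.
    """
    return max(g([e(c, tau) for c in sigma]) for tau in sols_t)

def e_p(c, tau):
    """Predicate error: 0 if valuation tau satisfies constraint c, else 1."""
    (x, y), pairs = c
    return 0 if (tau[x], tau[y]) in pairs else 1

def e_d(c, tau):
    """Hypothetical metric error for a constraint 'x = k': |tau[x] - k|."""
    x, k = c
    return abs(tau[x] - k)

def sum_squares(errors):
    return sum(err * err for err in errors)

sols_t = [{"x": 1, "y": 1, "z": 1}, {"x": 2, "y": 2, "z": 2}]
sigma_ucb = [(("x", "y"), {(1, 1)}), (("y", "z"), {(1, 2)})]
sigma_lsb = [("x", 3), ("y", 1)]

d_ucb = mu(sigma_ucb, sols_t, e_p, sum)          # UCB: count violations
d_lsb = mu(sigma_lsb, sols_t, e_d, sum_squares)  # LSB: sum of squared errors
```

In this toy data the second valuation violates both predicate constraints, so it determines the UCB distance, while the first valuation is the worst one under the least-squares measure.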

4.2  Pseudo-code 

In Fig. 1 we present logic-programming-style pseudo-code which transforms HCLP
into PCSP in the manner described in this section. The procedure has a collection
of labelled constraints and a comparator as input, and outputs a collection
of unlabelled constraints and a distance function. Therefore if we do not
have an implementation of HCLP, we can replace a call to it by the two calls
transform-HCLP-PCSP, PCSP.

    transform-HCLP-PCSP( (LabelledConstraints, Comparator),
                         (Constraints, DF) ) :-
        partition-constraints( LabelledConstraints,
                               (Required, Strong, Weak, ...) ),
        remove-labels( (Required, Strong, Weak, ...),
                       (UL-Required, UL-Strong, UL-Weak, ...) ),

        % DF = Distance function.
        % The type of DF is determined by the e and g functions
        % (which are determined by the choice of comparator), and
        % also parameterised by the different levels of constraints
        distance-function-0( (Required, Comparator), DF-R(UL-Required) ),
        distance-function-1( (Strong, Comparator), DF-S(UL-Strong) ),
        distance-function-2( (Weak, Comparator), DF-W(UL-Weak) ),

        collect-distance-functions( (DF-R(UL-Required), DF-S(UL-Strong),
                                     DF-W(UL-Weak), ...), DF ),

        collect-constraints( (UL-Required, UL-Strong, UL-Weak, ...),
                             Constraints ).


Fig. 1. HCLP into PCSP
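
For readers who prefer an executable form, the following Python sketch mirrors the structure of Fig. 1 under stated assumptions. It is not the authors' implementation: strengths are assumed to be integers with 0 meaning required, a comparator is assumed to be a pair $(e, g)$ with non-negative errors, and a problem $T$ is assumed to be given by its non-empty list of solutions.

```python
import math

def transform_hclp_to_pcsp(labelled_constraints, comparator):
    """Return (unlabelled constraints, stratified distance function).

    labelled_constraints: list of (strength, constraint), strength 0 = required.
    comparator: pair (e, g) of error function and combining function.
    """
    e, g = comparator
    n_levels = max(s for s, _ in labelled_constraints) + 1
    # partition-constraints and remove-labels in one step
    levels = [[c for s, c in labelled_constraints if s == k]
              for k in range(n_levels)]

    def level_distance(k, sols_t):
        worst = max(g([e(c, tau) for c in levels[k]]) for tau in sols_t)
        if k == 0:  # the required level plays the role of mu_inf
            return 0 if worst == 0 else math.inf
        return worst

    def distance_function(sols_t):
        return tuple(level_distance(k, sols_t) for k in range(n_levels))

    constraints = [c for level in levels for c in level]
    return constraints, distance_function

def e_p(c, tau):
    (x, y), pairs = c
    return 0 if (tau[x], tau[y]) in pairs else 1

labelled = [
    (1, (("S", "T"), {("r", "g"), ("w", "b"), ("w", "d")})),
    (2, (("F", "T"), {("s", "d"), ("c", "g")})),
    (2, (("S", "F"), {("w", "c")})),
]
constraints, df = transform_hclp_to_pcsp(labelled, (e_p, sum))
annotation = df([{"S": "r", "F": "c", "T": "g"}])
```

The returned `annotation` is the sequence (required, strong, weak) for a relaxed problem whose only solution is the red shirt, cordovans and grey trousers.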


4.3  Example 

In  this  section  we  present  an  example  of  an  over-constrained  system  and  its 
specification  and  solution  in  HCLP,  and  then  show  its  transformation  into  PCSP. 

Consider the problem of choosing matching clothes (example adapted from
Freuder and Wallace [6]). A robot wishes to wear a shirt, some shoes, and some
trousers, and wants them all to match each other. There are various choices for
the different items and various constraints between them. We can easily model
this using three finite domain variables with a number of binary constraints
between them. If we use the letter $S$ to denote the variable for shirts, then we can
use $F$ for shoes (footwear) and $T$ for trousers. The domain of the shirt variable
will be $S :: \{r, w\}$ for red and white respectively, and similarly shoes and trousers
will have domains $F :: \{c, s\}$ for cordovans and sneakers, and $T :: \{b, d, g\}$, for
blue, denim, and grey. A constraint that shirts must match footwear will be
denoted $C_{SF}$, and so on. Then, using Freuder and Wallace’s assumptions about
which clothes go with which, the complete problem can be expressed formally
as follows (we will call this model Z):

$$S :: \{r, w\}, \quad F :: \{c, s\}, \quad T :: \{b, d, g\}$$

$$C_{ST} :: \{(r,g), (w,b), (w,d)\}, \quad C_{FT} :: \{(s,d), (c,g)\}, \quad C_{SF} :: \{(w,c)\}$$

This problem is over-constrained; it has no solutions. We can see this by choosing
the red shirt, and tracing the implications of this choice. We must choose the
grey trousers, which forces us to choose the cordovans as footwear. But according
to $C_{SF}$, the cordovans only go with the white shirt. Contradiction. We can
trace the effects of choosing the white shirt in the same way, also arriving at a
contradiction. Therefore we need to consider some way of relaxing or weakening
the problem until solutions can be found.
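
The claim that model Z has no solutions can be checked by exhaustive search over the twelve possible outfits. The sketch below is illustrative; the dictionary layout and the function name are our own.

```python
from itertools import product

DOMAINS = {"S": "rw", "F": "cs", "T": "bdg"}
Z = {
    ("S", "T"): {("r", "g"), ("w", "b"), ("w", "d")},
    ("F", "T"): {("s", "d"), ("c", "g")},
    ("S", "F"): {("w", "c")},
}

def solutions(constraints):
    """All (S, F, T) triples satisfying every constraint, by exhaustive search."""
    found = []
    for s, f, t in product(DOMAINS["S"], DOMAINS["F"], DOMAINS["T"]):
        val = {"S": s, "F": f, "T": t}
        if all((val[x], val[y]) in pairs for (x, y), pairs in constraints.items()):
            found.append((s, f, t))
    return found

# Dropping the shirt/footwear constraint is one possible relaxation.
without_sf = {k: v for k, v in Z.items() if k != ("S", "F")}
```

`solutions(Z)` is empty, whereas `solutions(without_sf)` is not, which shows how a single relaxation can restore consistency.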


Example in HCLP. Let us use HCLP strength labels to indicate our assumption
that, say, shirts and trousers are more important than footwear, and let us
choose the unsatisfied-count-better (UCB) comparator:

strong $C_{ST}$, weak $C_{FT}$, weak $C_{SF}$

The solutions to this hierarchy will equal the solutions to the two equally acceptable
relaxed problems $\{C_{ST}, C_{FT}\}$ and $\{C_{ST}, C_{SF}\}$ which are, in the variable
order $(S, F, T)$, $\{(r,c,g), (w,s,d)\}$ and $\{(w,c,b), (w,c,d)\}$ respectively.
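
The four HCLP solutions can be reproduced with a small Python sketch that scores every valuation by its number of unsatisfied constraints at each level and keeps the lexicographically least. The data structures and function names are assumptions for illustration.

```python
from itertools import product

DOMAINS = {"S": "rw", "F": "cs", "T": "bdg"}
STRONG = [(("S", "T"), {("r", "g"), ("w", "b"), ("w", "d")})]
WEAK = [(("F", "T"), {("s", "d"), ("c", "g")}),
        (("S", "F"), {("w", "c")})]

def unsatisfied(level, val):
    """Number of constraints in the level that the valuation violates."""
    return sum((val[x], val[y]) not in pairs for (x, y), pairs in level)

def ucb_best():
    """Return the least (strong, weak) score and the valuations achieving it."""
    scored = {}
    for s, f, t in product(*DOMAINS.values()):   # order S, F, T
        val = {"S": s, "F": f, "T": t}
        scored[(s, f, t)] = (unsatisfied(STRONG, val), unsatisfied(WEAK, val))
    least = min(scored.values())
    return least, sorted(v for v, sc in scored.items() if sc == least)

least, best = ucb_best()
```

Each best valuation satisfies the strong constraint and violates exactly one of the two weak constraints.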


HCLP Formulation Transformed into PCSP. The base set of constraints
for the PCSP formulation will be all the constraints from the HCLP version,
but without strength labels. The distance function will be in two parts, $\mu =
[\mu_{UCB}(C_{ST}), \mu_{UCB}(C_{FT}, C_{SF})]$, one of which measures the relaxation of the strong
constraint, and another for the weak level of the hierarchy. The order will be the
lexicographic order over the sequences of integers $[s, w]$ produced by $\mu$.

UCB and $\mu_{UCB}(\sigma)$ are defined elsewhere in this paper, as is the notion of constraint
augmentation. Here it suffices to say that the best solutions, i.e. those earliest
in the order created by the distance function, will be the sets of constraints
$\{C_{ST}, C_{FT}, C'_{SF}\}$, $\{C_{ST}, C_{FT}, C''_{SF}\}$, $\{C_{ST}, C'_{FT}, C_{SF}\}$, and $\{C_{ST}, C''_{FT}, C_{SF}\}$,
where $C'_{SF} = \{(w,c), (r,c)\}$, i.e. $C_{SF}$ augmented with the extra tuple $(r,c)$, and
the other three solutions also contain one augmented constraint ($C''_{SF} = \{(w,c), (w,s)\}$,
$C'_{FT} = \{(s,d), (c,g), (c,b)\}$, $C''_{FT} = \{(s,d), (c,g), (c,d)\}$). The solutions
from these four sets of constraints, in variable order $(S, F, T)$, are $\{(r,c,g)\}$,
$\{(w,s,d)\}$, $\{(w,c,b)\}$, and $\{(w,c,d)\}$, identical to the HCLP solutions.
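
The four augmented problems can be checked mechanically. In the Python sketch below, the key names such as "C'_SF" are merely labels, the tuple $(w, s)$ for the second shirt/footwear augmentation is the one implied by the solution $(w, s, d)$, and the helper functions are our own illustrative assumptions.

```python
from itertools import product

C_ST = {("r", "g"), ("w", "b"), ("w", "d")}
C_FT = {("s", "d"), ("c", "g")}
C_SF = {("w", "c")}

def sols(c_st, c_ft, c_sf):
    """Solutions (S, F, T) of the problem with the given three constraints."""
    return {(s, f, t) for s, f, t in product("rw", "cs", "bdg")
            if (s, t) in c_st and (f, t) in c_ft and (s, f) in c_sf}

def mu_ucb(level, solutions):
    """Worst-case number of ORIGINAL constraints in `level` that a solution violates."""
    return max(sum(not check(sol) for check in level) for sol in solutions)

strong = [lambda v: (v[0], v[2]) in C_ST]
weak = [lambda v: (v[1], v[2]) in C_FT, lambda v: (v[0], v[1]) in C_SF]

relaxations = {
    "C'_SF": sols(C_ST, C_FT, C_SF | {("r", "c")}),
    "C''_SF": sols(C_ST, C_FT, C_SF | {("w", "s")}),
    "C'_FT": sols(C_ST, C_FT | {("c", "b")}, C_SF),
    "C''_FT": sols(C_ST, C_FT | {("c", "d")}, C_SF),
}
distances = {name: (mu_ucb(strong, s), mu_ucb(weak, s))
             for name, s in relaxations.items()}
```

All four relaxations receive the same annotation $[s, w]$, which is why they are equally good under the lexicographic order.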

5  Transforming  PCSP  into  HCLP 

5.1  Transforming  the  Standard  PCSP  Distance  Function 

To  transform  PCSP  with  the  standard  distance  function  into  HCLP,  we  take 
the  constraints  in  P  and  give  them  all  the  same  arbitrary  non-required  strength 
label,  say  ‘strong’.  Thus  they  will  be  placed  in  Hi.  Then  we  use  the  HCLP 
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comparator  unsatisfied- count-better  (UCB).  We  claim  that  this  is  the  correct 
comparator  to  use,  i.e.  we  claim  that  the  solutions  calculated  by  HCLP  using 
UCB  are  the  same  as  those  in  PCSP,  and  the  particular  solutions  which  are 
best  according  to  PCSP  will  also  be  best  according  to  UCB.  (The  intuition  is 
as  follows:  the  number  of  unsatisfied  constraints  counted  by  UCB  is  the  same 
as  the  number  of  constraints  which  would  need  a  single  domain  augmentation 
to  create  a  consistent  CSP,  thus  UCB  measures  an  equivalent  distance  to  that 
measured  in  PCSP.) 
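
The intuition can be illustrated with a small brute-force sketch in Python. It is not the implementation of any HCLP system; the function names and the toy problem (two Boolean variables X and Y with three hypothetical constraints) are assumptions made purely for illustration. UCB is modelled here simply as: count the constraints that a valuation leaves unsatisfied, and keep the valuations with the smallest count.

```python
from itertools import product

def violated_count(valuation, constraints):
    """Number of constraints not satisfied by a full valuation (dict var -> value)."""
    return sum(
        1
        for scope, allowed in constraints
        if tuple(valuation[v] for v in scope) not in allowed
    )

def ucb_best(domains, constraints):
    """Brute force: return (smallest violation count, valuations achieving it)."""
    variables = sorted(domains)
    best, best_valuations = None, []
    for values in product(*(domains[v] for v in variables)):
        valuation = dict(zip(variables, values))
        count = violated_count(valuation, constraints)
        if best is None or count < best:
            best, best_valuations = count, [valuation]
        elif count == best:
            best_valuations.append(valuation)
    return best, best_valuations

# Hypothetical toy problem: X = Y, X != Y and X = 0 cannot all hold at once.
domains = {"X": [0, 1], "Y": [0, 1]}
constraints = [
    (("X", "Y"), {(0, 0), (1, 1)}),  # X = Y
    (("X", "Y"), {(0, 1), (1, 0)}),  # X != Y
    (("X",), {(0,)}),                # X = 0
]
best_count, best_valuations = ucb_best(domains, constraints)
```

In this toy problem each best valuation leaves exactly one constraint unsatisfied, and that constraint would need exactly one added tuple for the valuation to become a solution, which is the correspondence the paragraph above appeals to.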

Certain  combinations  of  augmented  constraints  in  the  PCSP  formulation, 
which  duplicate  solutions  found  at  a  closer  distance,  will  not  appear  in  the 
HCLP  answer,  but  all  the  solutions  to  these  combinations  will  appear.  (Here 
is an analogy: if the list of PCSP solutions, in order from best to worst, is
[a, b, c, a, d, a, e], the list of HCLP solutions may be [a, b, c, d, e]. So although the
lists are not equal, the fact that a should be chosen before b or d is present in
both representations.) See Fig. 2 for pseudo-code.


Detailed  Defence  of  Choice  of  UCB.  This  section  contains  a  detailed  de¬ 
fence  of  our  choice  of  UCB  as  the  comparator  to  use  in  HCLP  when  transforming 
from  PCSP.  It  addresses  one  possible  key  objection,  but  does  not  affect  the  pre¬ 
sentation  in  subsequent  sections  of  the  paper. 

Consider  those  PCSP  weakenings  which  involve  more  than  one  augmentation 
of  a  single  constraint.  We  claim  that  the  following  complaint  about  our  choice 
of  UCB  is  unjustified:  “UCB  will  just  detect  that  a  constraint  had  been  violated 
by  a  valuation.  It  wouldn’t  detect  that  two  different  augmentations  would  be 
necessary  for  the  constraint  not  to  be  violated.”  It  is  incoherent  because  two 
augmentations  can  never  be  necessary  for  a  single  constraint  not  to  be  violated. 
Two  augmentations  to  a  single  constraint  might,  however,  lead  to  an  additional 
two  or  more  solutions,  but  we  can  ignore  this  situation  due  to  the  following 
claim: 

Claim: the additional solutions caused by n ≥ 2 augmentations of a single
constraint  can  be  completely  separated  into  n  classes,  each  of  which  contains 
solutions  caused  by  only  one  of  the  n  augmentations.  The  CSPs  represented 
by  these  singly-augmented  constraints  will  all  appear  in  the  partial  order  in¬ 
duced  by  the  distance  function,  and  they  will  all  appear  earlier  than  the  CSP 
containing  the  n-augmented  constraint.  Therefore,  no  solutions  will  be  lost  by 
ignoring  all  multiply-augmented  constraints.  Therefore,  the  fact  that  UCB  only 
picks  out  those  solutions  which  violate  the  smallest  number  of  singly-augmented 
constraints does not change the set of solutions computed. (All that would
happen is that two solutions $s_1$ and $s_2$ will separately appear as, say, the
equal-best solutions to the hierarchy, but their union will fail to appear as a
second-best or third-best solution.)

Example: Let $A'$ denote the constraint $A$ with one extra tuple added to
its domain, in the usual manner. Usually there will be more than one way to
augment $A$; these alternatives may be indicated by $A'_1$, $A'_2$, etc. Let $A''$
generally denote two augmentations to $A$, and specifically let $A''_{1,2}$ denote
that the two augmentations are equivalent to $A'_1 \cup A'_2$. Then our claim is that
all the solutions to a CSP containing $A''_{1,2}$ are present in the union of the
solution sets of the CSPs that contain $A'_1$ or $A'_2$ in its place, with the other
constraints unchanged; in other words, we can ignore multiple augmentations of
a single constraint.


    transform-PCSP-HCLP( (Constraints, DF(Type,SpecialCons)),
                         (LabelledConstraints, Comparator) ) :-

        % if Type = standard, then UCB comparator will be chosen, etc.
        % if no constraints are highlighted by the distance function,
        % then SpecialCons will be empty and all constraints are 'strong'

        partition-constraints( (Constraints, DF(Type,SpecialCons)),
                               (UL-Strong, UL-Weak, ...) ),
        add-labels( (UL-Strong, UL-Weak, ...), (Strong, Weak, ...) ),
        create-comparator(Type, Comparator),
        collect-constraints( (Strong, Weak, ...), LabelledConstraints ).


Fig. 2. PCSP into HCLP

Intuition:  Consider  the  CSP  as  a  graph,  with  each  variable  represented  by 
a  node  and  each  constraint  represented  by  an  edge.  The  tuples  which  make  up 
the  constraint  are  labels  for  the  edges.  A  solution  to  the  CSP  is  a  path  through 
every  edge  in  the  graph,  consistent  with  the  labels.  If  we  add  a  label  to  an  edge, 
we  are  increasing  by  one  the  number  of  paths  between  the  two  nodes  connected 
by that edge¹. If instead we added a different label, we would again increase
the  number  of  paths  between  these  two  nodes  by  one.  It  is  intuitively  clear  that 
adding  these  two  labels  simultaneously  will  add  precisely  two  paths  between  the 
two  nodes:  any  path  can  only  take  account  of  one  of  the  two  labels  on  the  edge. 
We  could  have  arrived  at  the  same  set  of  total  paths  through  the  graph  by  taking 
two  copies  of  the  original  graph,  adding  one  new  label  to  each  of  them,  finding 
the  new  paths  caused  by  this  single  extra  label,  and  then  eventually  taking  the 
union  of  the  two  sets  of  paths. 

Proof: Consider various binary constraints over different pairs selected from
n variables $X_1, X_2, X_3$, etc. We can define the expansion of each constraint
$C_{ij}$, which originally related $X_i$ and $X_j$, to a set of n-tuples by creating a tuple
for each element of the cartesian product of the domains of the variables not
originally involved in the constraint:

$$C^*_{ij} = \{\, (v_1, \ldots, v_n) \mid (v_i, v_j) \in C_{ij},\ v_k \in \mathrm{dom}(X_k) \text{ for all } k \neq i,\ k \neq j \,\}$$

¹ The number of paths through the entire graph may increase by more than one. If
there are k paths leading into the start node of the edge under consideration, and l
paths leading away from the end node, then adding a path between the two nodes
may increase the number of paths through the entire graph by up to kl.

Example: if X has domain {a, b}, Y has domain {c, d}, and Z has domain {e, f},
and if $A_{XY} = \{(a,c), (b,d)\}$, then $A^*_{XY} = \{(a,c,e), (a,c,f), (b,d,e), (b,d,f)\}$.

It is clear that the solution to a CSP is precisely the intersection of the
expanded versions of each of its constraints. Thus instead of considering the
solution of the set of constraints $\{A_{XY}, B_{YZ}, C_{XZ}\}$, we can just consider
$A^*_{XY} \cap B^*_{YZ} \cap C^*_{XZ}$. (This is similar to the relational database idea of a
‘join’ between two relations.)
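
A minimal Python sketch of this expansion-and-intersection view follows. The relation for A is the one from the example above; the relations chosen for B and C, and the helper name `expand`, are hypothetical and serve only to make the intersection concrete.

```python
from itertools import product

def expand(scope, relation, variables, domains):
    """Expand a constraint on `scope` into full tuples over all `variables`."""
    positions = [variables.index(v) for v in scope]
    return {
        full
        for full in product(*(domains[v] for v in variables))
        if tuple(full[p] for p in positions) in relation
    }

variables = ["X", "Y", "Z"]
domains = {"X": ["a", "b"], "Y": ["c", "d"], "Z": ["e", "f"]}

A_xy = {("a", "c"), ("b", "d")}  # from the example in the text
B_yz = {("c", "e"), ("d", "f")}  # hypothetical relation, for illustration only
C_xz = {("a", "e"), ("b", "e")}  # hypothetical relation, for illustration only

A_star = expand(("X", "Y"), A_xy, variables, domains)
B_star = expand(("Y", "Z"), B_yz, variables, domains)
C_star = expand(("X", "Z"), C_xz, variables, domains)

# The solutions of the CSP are the tuples common to all expanded constraints.
solutions = A_star & B_star & C_star
```

Here `A_star` reproduces the four tuples listed in the example, and the intersection leaves the single solution (a, c, e) for the assumed B and C.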

If we add one pair to the domain of one of the constraints in a CSP, it is
equivalent to adding a set of n-tuples to the domain of that constraint's
expanded version, where the other places in the tuple are filled with all possible
combinations of elements from the domains of all the other variables.
Continuing with the example, let us assume, without loss of generality, that we have
augmented constraint B. This leads to adding a set of n-tuples to $B^*$; let us
call this set of additional tuples R. We can imagine adding a different pair to B
which would lead to adding a different set to $B^*$, say R'. If we add both pairs
to B at the same time, it is clear that we must add $R \cup R'$ to $B^*$.

Our  claim  is  that  we  can  ignore  CSPs  where  one  constraint  has  been  multiply 
augmented;  all  their  solutions  will  be  present  in  the  union  of  the  solutions  to 
CSPs  with  singly-augmented  constraints.  This  is  equivalent  to  claiming 

$$A^* \cap (B^* \cup (R \cup R')) \cap C^* = (A^* \cap (B^* \cup R) \cap C^*) \cup (A^* \cap (B^* \cup R') \cap C^*)$$

The proof is a straightforward exercise in the use of the distributivity laws of
set theory ($J \cup (K \cap L) = (J \cup K) \cap (J \cup L)$ and its dual), with one use of the
idempotence of set union ($K \cup K = K$).
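
One possible way of writing out that exercise is sketched below. The particular sequence of steps is our own choice and is not taken from the paper; it uses idempotence once and the dual distributivity law (intersection over union) once.

```latex
\begin{align*}
A^* \cap (B^* \cup (R \cup R')) \cap C^*
  &= A^* \cap ((B^* \cup B^*) \cup (R \cup R')) \cap C^*
     && \text{idempotence: } B^* \cup B^* = B^* \\
  &= A^* \cap ((B^* \cup R) \cup (B^* \cup R')) \cap C^*
     && \text{associativity, commutativity of } \cup \\
  &= (A^* \cap C^*) \cap ((B^* \cup R) \cup (B^* \cup R'))
     && \text{commutativity of } \cap \\
  &= ((A^* \cap C^*) \cap (B^* \cup R)) \cup ((A^* \cap C^*) \cap (B^* \cup R'))
     && \text{distributivity of } \cap \text{ over } \cup \\
  &= (A^* \cap (B^* \cup R) \cap C^*) \cup (A^* \cap (B^* \cup R') \cap C^*)
     && \text{commutativity of } \cap
\end{align*}
```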

Therefore,  using  UCB  as  our  comparator  in  the  automatically  generated 
HCLP  version  of  a  PCSP  is  acceptable.  So  our  transformation  from  PCSP  to 
HCLP  holds. 


5.2  Transforming  Non-Standard  Distance  Functions 

We  have  shown  above  how  to  transform  problems  using  the  standard  PCSP 
distance  function  into  HCLP.  We  now  consider  three  other  possibilities,  firstly 
where  all  the  variables  and  constraints  are  treated  equally  by  the  distance  func¬ 
tion  but  the  distance  is  not  defined  as  minimum  augmentation,  secondly  where 
some  of  the  variables  in  the  problem  are  highlighted,  and  finally  where  some  of 
the  constraints  are  highlighted. 


Non-Specific  (Homogeneous)  Distance  Functions.  All  the  constraints  are 
put  at  the  ‘strong’  level  of  the  hierarchy  resulting  from  the  transformation.  The 
combining  function  embodied  by  the  distance  function  must  be  transformed  into 
an  HCLP-like  comparator,  specifically  into  an  error  function  for  each  constraint 
and a combining function which combines the errors at each level.


    ?- HCLP( (LabelledConstraints, Comparator), HCLP-Solutions ),
       transform-HP( (LabelledConstraints, Comparator),
                     (Constraints, DF) ),
       PCSP( (Constraints, DF), PCSP-Solutions ),
       equiv( HCLP-Solutions, PCSP-Solutions ).


Fig.  3.  Equivalence 


Distance  Functions  which  Prefer  a  Subset  of  the  Variables.  In  general 
CSPs  are  considered  in  terms  of  binary  constraints.  The  theory  can  be  extended, 
but  complications  are  introduced.  CLP,  on  the  other  hand,  is  indifferent  to  the 
arity  of  constraints.  Therefore,  if  a  PCSP  problem  has  some  kind  of  cost  function 
which  selects  solutions  which  minimise  the  value  of  some  function  of  (some  of) 
the  variables,  we  can  simply  treat  it  as  another  constraint.  If  the  use  of  the 
cost  function  is  expressed  in  the  usual  way  (“Do  not  violate  any  constraints 
in  order  to  minimise  the  function”)  then  it  can  be  labelled  ‘weak’,  while  all  the 
constraints  in  the  original  PCSP  are  labelled  ‘strong.’  If  it  is  acceptable  to  violate 
constraints  in  order  to  minimise  the  function,  then  the  inverse  strength  labelling 
can  be  used. 


Distance  Functions  which  Prefer  a  Subset  of  the  Constraints.  This 
possibility  can  be  transformed  into  HCLP  in  a  very  straightforward  manner:  the 
preferred  constraints  are  labelled  ‘strong’,  while  the  others  are  labelled  ‘weak’.  If 
there  are  multiple  subsets  with  some  order  over  them,  then  clearly  more  HCLP 
strength  levels  can  be  used. 
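
As a rough illustration of how such a labelling could be evaluated, the Python sketch below counts unsatisfied constraints separately at each strength level and compares the resulting sequences lexicographically, strongest level first. The function names and the two-variable problem are hypothetical, and counting violations per level is only one possible choice of error and combining functions.

```python
from itertools import product

def level_errors(valuation, levels):
    """Violation count per strength level, strongest level first."""
    return tuple(
        sum(
            1
            for scope, relation in level
            if tuple(valuation[v] for v in scope) not in relation
        )
        for level in levels
    )

def best_by_levels(domains, levels):
    """Brute force: valuations whose per-level error sequence is lexicographically least."""
    variables = sorted(domains)
    scored = []
    for values in product(*(domains[v] for v in variables)):
        valuation = dict(zip(variables, values))
        scored.append((level_errors(valuation, levels), values))
    best = min(score for score, _ in scored)
    return best, [values for score, values in scored if score == best]

# Hypothetical problem: the preferred ('strong') constraint is X != Y;
# the remaining ('weak') constraints ask for X = 1 and Y = 1.
domains = {"X": [0, 1], "Y": [0, 1]}
strong = [(("X", "Y"), {(0, 1), (1, 0)})]
weak = [(("X",), {(1,)}), (("Y",), {(1,)})]

best, winners = best_by_levels(domains, [strong, weak])
```

Python tuples compare lexicographically, so a valuation that satisfies the strong constraint always beats one that violates it, whatever happens at the weak level.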

6  Conclusions  and  Further  Work 

6.1  Conclusions 

We  have  developed  a  general  methodology  for  transforming  between  HCLP  and 
PCSP.  We  have  clarified  various  issues,  and  provided  a  proof  of  correctness.  We 
have  shown  that  strength  labels,  associated  with  constraints  in  HCLP,  contain 
information  which  is  necessary  to  define  the  global  distance  function  in  PCSP. 

The main claim of this paper can be stated as a query in logic programming
terms (Fig. 3). We assume the existence of two procedures, which interface to
standard implementations of HCLP and PCSP respectively. An
equivalent claim to that in Fig. 3 is that if we do not have an implementation
of HCLP, we can replace a call to it by the two calls transform-HP, PCSP, and
vice-versa.

HCLP and PCSP each have advantages when modelling problems, and each
have advantages when implementing models and solving them. Using the work
presented in this paper, the appropriate paradigm can be used for each of these
steps, with a meaning-preserving transformation in between if necessary.


6.2  Further  Work 

We  would  like  to  investigate  issues  of  algorithmic  complexity  within  the  two 
paradigms.


Abstract. Many combinatorial search problems can be expressed as
‘constraint  satisfaction  problems’,  and  this  class  of  problems  is  known  to 
be  NP-complete  in  general.  In  this  paper  we  investigate  restricted  classes 
of  constraints  which  give  rise  to  tractable  problems.  We  show  that  any  set 
of  constraints  must  satisfy  a  certain  type  of  algebraic  closure  condition 
in  order  to  avoid  NP-completeness.  We  also  describe  a  simple  test  which 
can  be  applied  to  establish  whether  a  given  set  of  constraints  satisfies  a 
condition  of  this  kind.  The  test  involves  solving  a  particular  constraint 
satisfaction  problem,  which  we  call  an  ‘indicator  problem’. 

Keywords:  Constraint  satisfaction  problem,  complexity,  NP-completeness, 
indicator  problem 


1  Introduction 

Solving  a  constraint  satisfaction  problem  is  known  to  be  an  NP-complete  problem 
in  general  [13]  even  when  the  constraints  are  restricted  to  binary  constraints. 
However,  many  of  the  problems  which  arise  in  practice  have  special  properties 
which  allow  them  to  be  solved  efficiently.  The  question  of  identifying  restrictions 
to  the  general  problem  which  are  sufficient  to  ensure  tractability  is  important 
from  both  a  practical  and  a  theoretical  viewpoint,  and  has  been  extensively 
studied. 

Such  restrictions  may  either  involve  the  structure  of  the  constraints,  in  other 
words,  which  variables  may  be  constrained  by  which  other  variables,  or  they 
may  involve  the  nature  of  the  constraints,  in  other  words,  which  combinations  of 
values  may  be  allowed  for  variables  which  are  mutually  constrained.  Examples 
of  the  first  approach  may  be  found  in  [4,  5,  7,  14,  15]  and  examples  of  the  second 
approach  may  be  found  in  [2,  9,  10,  11,  14,  18,  19]. 

In  this  paper  we  take  the  second  approach,  and  investigate  those  classes 
of  constraints  which  ensure  tractability  in  whatever  way  they  are  combined.  A 
number  of  distinct  classes  of  constraints  with  this  property  have  previously  been 
identified and shown to be maximal [2, 10, 9]. It is currently unknown whether
there are any further tractable constraint classes still to be identified.

In [9] we showed that all known examples of such classes are characterized
by a simple algebraic closure condition. This naturally raised the question of
whether all possible tractable classes satisfy such a closure condition. In this
paper we answer this question by proving that any class of constraints which does
not give rise to NP-complete problems must indeed satisfy such a closure
condition, and hence this property is a necessary condition for a class of constraints
to be tractable (assuming that P is not equal to NP).

Furthermore,  we  describe  a  simple  test  to  establish  whether  a  given  set  of 
constraints  satisfies  this  necessary  algebraic  condition.  The  test  involves  calculat¬ 
ing  the  solutions  to  a  fixed  constraint  satisfaction  problem  involving  constraints 
from  the  given  set. 

The  paper  is  organised  as  follows.  In  Section  2  we  give  the  basic  definitions, 
and  describe  a  general  form  of  algebraic  closure  condition  for  a  set  of  relations. 
In  Section  3  we  show  that  this  condition  is  necessary  for  tractability,  and  in 
Section  4  we  describe  a  test  for  this  condition.  Finally  we  summarise  the  results 
presented  and  draw  some  conclusions. 


2  Definitions 

2.1  The  constraint  satisfaction  problem 

Notation 1 For any set D, and any natural number n, we denote the set of all
n-tuples of elements of D by $D^n$. For any tuple $t \in D^n$, and any i in the range
1 to n, we denote the value in the ith coordinate position of t by t[i]. The tuple
t will be written in the form (t[1], t[2], ..., t[n]).

A subset of $D^n$ is called an n-ary relation over D.

We  now  define  the  (finite)  constraint  satisfaction  problem  which  has  been 
widely studied in the Artificial Intelligence community [13, 14, 12].

Definition  2.  An  instance  of  a  constraint  satisfaction  problem  consists  of 

—  a  finite  set  of  variables,  V ; 

—  a  finite  domain  of  values,  D; 

—  a set of constraints $\{C_1, C_2, \ldots, C_q\}$.

Each constraint $C_i$ is a pair $(S_i, R_i)$, where $S_i$ is a list of variables of length
$m_i$, called the constraint scope, and $R_i$ is an $m_i$-ary relation over D, called
the constraint relation. The tuples of $R_i$ indicate the allowed combinations
of simultaneous values for the variables in $S_i$.

The  length  of  the  tuples  in  the  constraint  relation  of  a  given  constraint  will  be 
called  the  arity  of  that  constraint.  In  particular,  unary  constraints  specify  the 
allowed  values  for  a  single  variable,  and  binary  constraints  specify  the  allowed 
combinations  of  values  for  a  pair  of  variables.  A  solution  to  a  constraint  satisfac¬ 
tion  problem  is  a  function  from  the  variables  to  the  domain  such  that  the  image 
of  each  constraint  scope  is  an  element  of  the  corresponding  constraint  relation. 
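
Definition 2 and the notion of a solution can be sketched directly in Python. The representation below (a constraint as a pair of a scope tuple and a set of allowed tuples) and the small instance are assumptions chosen for illustration; the exhaustive search is only meant to make the definition concrete, not to be efficient.

```python
from itertools import product

def is_solution(assignment, constraints):
    """True if the image of every constraint scope lies in its constraint relation."""
    return all(
        tuple(assignment[v] for v in scope) in relation
        for scope, relation in constraints
    )

def all_solutions(variables, domain, constraints):
    """Enumerate every function from variables to domain and keep the solutions."""
    found = []
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if is_solution(assignment, constraints):
            found.append(assignment)
    return found

# Hypothetical instance: x != y, y != z, and the unary constraint x = 0.
variables = ["x", "y", "z"]
domain = [0, 1]
different = {(0, 1), (1, 0)}
constraints = [
    (("x", "y"), different),  # binary constraint
    (("y", "z"), different),  # binary constraint
    (("x",), {(0,)}),         # unary constraint
]
found = all_solutions(variables, domain, constraints)
```

The search visits all |D|^|V| assignments, which is why restrictions that guarantee tractability are of interest.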

Deciding  whether  or  not  a  given  problem  instance  has  a  solution  is  NP- 
complete  in  general  [13]  even  when  the  constraints  are  restricted  to  binary  con¬ 
straints.  In  this  paper  we  shall  consider  how  restricting  the  allowed  constraint 
relations  to  some  fixed  subset  of  all  the  possible  relations  affects  the  complexity 
of this decision problem. We therefore make the following definition.

Definition 3. For any set of relations, Γ, CSP(Γ) is defined to be the class of
decision problems with

INSTANCE: A constraint satisfaction problem P in which all constraint
relations are elements of Γ.

QUESTION: Does P have a solution?

If there exists an algorithm which solves every problem in CSP(Γ) in polynomial
time, then we shall say that Γ is a tractable set of relations.

Example 1. The binary inequality relation over a set D, denoted $\neq_D$, is defined
as

$$\neq_D \;=\; \{\, (d_1, d_2) \in D^2 \mid d_1 \neq d_2 \,\}.$$

Note that CSP($\{\neq_D\}$) corresponds precisely to the Graph |D|-Colorability
problem [6]. This problem is tractable when |D| ≤ 2 and NP-complete when
|D| ≥ 3.
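
The correspondence with graph colouring can be made concrete with a short Python sketch. The function names and the triangle graph are our own illustrative choices: each edge of the graph becomes one constraint whose relation is the inequality relation over the set of colours.

```python
from itertools import product

def neq(domain):
    """The binary inequality relation over `domain`."""
    return {(a, b) for a in domain for b in domain if a != b}

def colourings(vertices, edges, domain):
    """All assignments of colours such that every edge satisfies the inequality relation."""
    relation = neq(domain)
    result = []
    for values in product(domain, repeat=len(vertices)):
        colour = dict(zip(vertices, values))
        if all((colour[u], colour[v]) in relation for u, v in edges):
            result.append(colour)
    return result

# A triangle: not 2-colourable, but 3-colourable in 3! = 6 ways.
vertices = ["a", "b", "c"]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
two_colourings = colourings(vertices, triangle, [0, 1])
three_colourings = colourings(vertices, triangle, [0, 1, 2])
```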

Example  2.  We  now  describe  four  relations  which  will  be  used  as  examples  of 
constraint  relations  throughout  the  paper. 

Each  of  these  relations  is  a  set  of  tuples  of  elements  from  the  domain  D  = 
{0, 1,  2},  as  defined  below: 


$R_1 = \{ (0,0), (1,2), (0,1), (2,1) \}$

$R_2 = \{ (0,1,2), (1,2,0), (2,0,1) \}$

$R_3 = \{ (0,1), (0,2), (1,0), (1,2), (2,0), (2,1) \}$

$R_4 = \{ (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2) \}$

CSP($\{R_1, R_2, R_3, R_4\}$) contains all constraint satisfaction problems in which the
constraint relations are all equal to one of $R_1$, $R_2$, $R_3$ or $R_4$.

Note that $R_3 = \neq_D$, so by Example 1, CSP(Γ) is NP-complete for any
subset, Γ, of the set $\{R_1, R_2, R_3, R_4\}$ containing the relation $R_3$.

The complexity of CSP(Γ) for subsets of $\{R_1, R_2, R_3, R_4\}$ which do not
contain $R_3$ will be determined using the techniques developed later in this paper.


2.2  Operations on relations

In Section 3 we shall examine conditions on a set of relations Γ which allow
known NP-complete problems to be reduced to CSP(Γ). The reductions will
be  described  using  standard  operations  from  relational  algebra  [1],  which  are 
described  in  this  section. 

Definition  4.  We  define  the  following  operations  on  relations. 

-  Let $R_1$ be an n-ary relation over a domain D and let $R_2$ be an m-ary relation
over D. The Cartesian product $R_1 \times R_2$ is defined to be the (n + m)-ary
relation

$$R_1 \times R_2 = \{\, (t[1], t[2], \ldots, t[n+m]) \mid (t[1], t[2], \ldots, t[n]) \in R_1 \ \wedge\ (t[n+1], t[n+2], \ldots, t[n+m]) \in R_2 \,\}.$$

-  Let R be an n-ary relation over a domain D. Let 1 ≤ i, j ≤ n. The equality
selection $\sigma_{i=j}(R)$ is defined to be the n-ary relation

$$\sigma_{i=j}(R) = \{\, t \in R \mid t[i] = t[j] \,\}.$$

-  Let R be an n-ary relation over a domain D. Let $i_1, \ldots, i_k$ be a subsequence
of 1, ..., n. The projection $\pi_{i_1,\ldots,i_k}(R)$ is defined to be the k-ary relation

$$\pi_{i_1,\ldots,i_k}(R) = \{\, (t[i_1], \ldots, t[i_k]) \mid t \in R \,\}.$$

It  is  well-known  that  the  combined  effect  of  two  constraints  in  a  constraint  sat¬ 
isfaction  problem  may  be  obtained  by  performing  a  relational  join  operation  [1] 
on  the  two  constraint  relations  [7].  The  next  result  is  a  simple  consequence  of 
the  definition  of  the  relational  join  operation. 

Lemma 5. The join of relations $R_1$ and $R_2$ can be calculated by performing a
sequence of Cartesian product, equality selection and projection operations on
$R_1$ and $R_2$.
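
The three operations of Definition 4, and the way a join can be assembled from them, can be sketched in Python as follows. Relations are represented as sets of tuples and coordinate positions are 1-based, as in the text; the function names and the two small relations are hypothetical.

```python
from itertools import product

def cartesian(r1, r2):
    """Cartesian product: concatenate every tuple of r1 with every tuple of r2."""
    return {t1 + t2 for t1, t2 in product(r1, r2)}

def select_eq(r, i, j):
    """Equality selection: keep the tuples whose i-th and j-th coordinates agree."""
    return {t for t in r if t[i - 1] == t[j - 1]}

def project(r, positions):
    """Projection onto the given (1-based) coordinate positions."""
    return {tuple(t[p - 1] for p in positions) for t in r}

# Hypothetical relations R1(X, Y) and R2(Y, Z).
r1 = {(0, 1), (1, 2)}
r2 = {(1, 5), (2, 6), (3, 7)}

# Join on Y: take the product, force coordinates 2 and 3 to agree,
# then drop the duplicated coordinate.
joined = project(select_eq(cartesian(r1, r2), 2, 3), [1, 2, 4])
```

For these relations the join contains exactly the tuples (0, 1, 5) and (1, 2, 6).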

In  view  of  this  result,  it  will  be  convenient  to  use  the  following  notation. 

Notation 6 The set of all relations which may be obtained from a given set of
relations, Γ, using some sequence of Cartesian product, equality selection, and
projection operations will be denoted $\Gamma^+$.

2.3  Operations  on  tuples 

We  will  show  in  Section  3  that  any  tractable  set  of  relations  over  a  set  D  must 
satisfy  certain  algebraic  conditions.  In  order  to  describe  these  conditions  we  need 
to consider arbitrary operations on D, in other words, arbitrary functions from
$D^k$ to D, for arbitrary values of k.

Any  such  operation  on  D  may  be  extended  to  an  operation  on  tuples  over  D 
by  applying  the  operation  in  each  coordinate  position  separately  (i.e.,  pointwise). 
Hence,  any  operation  defined  on  the  domain  of  a  relation  may  be  used  to  define 
an operation on the tuples in that relation, as follows:

Definition 7. Let R be an n-ary relation over a domain D, and let $\otimes : D^k \to D$
be a k-ary operation on D.

For any collection of k tuples, $t_1, t_2, \ldots, t_k \in R$ (not necessarily all distinct),
the n-tuple $\otimes(t_1, t_2, \ldots, t_k)$ is defined as follows:

$$\otimes(t_1, t_2, \ldots, t_k) = (\otimes(t_1[1], t_2[1], \ldots, t_k[1]),\ \otimes(t_1[2], t_2[2], \ldots, t_k[2]),\ \ldots,\ \otimes(t_1[n], t_2[n], \ldots, t_k[n])).$$

Using  this  definition,  we  now  define  the  following  closure  property  of  relations. 

Definition 8. Let R be a relation over a domain D, and let $\otimes : D^k \to D$ be a
k-ary operation on D.

R is said to be closed under ⊗ if, for all $t_1, t_2, \ldots, t_k \in R$ (not necessarily all
distinct), $\otimes(t_1, t_2, \ldots, t_k) \in R$.

Example 3. Let Δ denote the ternary operation which returns the first repeated
value of its three arguments, or the first value if they are all distinct.

The relation $R_2$ defined in Example 2 is closed under Δ, since applying the
Δ operation to any 3 elements of $R_2$ yields an element of $R_2$. For example,

$$\Delta((0,1,2), (1,2,0), (1,2,0)) = (1,2,0) \in R_2.$$

The relation $R_1$ defined in Example 2 is not closed under Δ, since applying the
Δ operation to the last 3 elements of $R_1$ yields a tuple which is not an element
of $R_1$:

$$\Delta((1,2), (0,1), (2,1)) = (1,1) \notin R_1.$$
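
The closure property of Example 3 can be checked mechanically. The Python sketch below is illustrative only: the names `delta`, `apply_pointwise` and `is_closed` are ours, and the contents of the two relations are taken to be R1 = {(0,0), (1,2), (0,1), (2,1)} and R2 = {(0,1,2), (1,2,0), (2,0,1)}, which is our reading of Example 2.

```python
from itertools import product

def delta(a, b, c):
    """First repeated value among the three arguments, or the first value if all differ."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return a

def apply_pointwise(op, tuples):
    """Apply `op` in each coordinate position separately (Definition 7)."""
    return tuple(op(*column) for column in zip(*tuples))

def is_closed(relation, op, arity):
    """Check closure over every sequence of `arity` tuples, repeats allowed."""
    return all(
        apply_pointwise(op, chosen) in relation
        for chosen in product(relation, repeat=arity)
    )

R1 = {(0, 0), (1, 2), (0, 1), (2, 1)}   # assumed reading of Example 2
R2 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}  # assumed reading of Example 2

r2_closed = is_closed(R2, delta, 3)
r1_closed = is_closed(R1, delta, 3)
witness = apply_pointwise(delta, [(1, 2), (0, 1), (2, 1)])
```

The check is exhaustive, so its cost grows as |R|^k for a k-ary operation; it is intended only for small relations such as these.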

For any set of relations Γ, and any operation ⊗, if every R ∈ Γ is closed under
⊗, then we shall say that Γ is closed under ⊗. The next lemma indicates that
the property of being closed under some operation is preserved by each of the
operations on relations described in Section 2.2.

Lemma 9. Let $R_1$ and $R_2$ be relations which are closed under ⊗, for some
operation ⊗.

The following relations are also closed under ⊗:

1.  the Cartesian product, $R_1 \times R_2$;

2.  any projection of $R_1$ or $R_2$;

3.  any equality selection from $R_1$ or $R_2$.

Proof.  Follows  immediately  from  the  definitions. 

We  shall  be  particularly  interested  in  operations  that  depend  on  more  than  one 
argument.  We  therefore  make  the  following  definition: 

Definition 10. An operation $\otimes : D^k \to D$ is called essentially unary if there
exists some non-constant unary operation f : D → D and some i in the range 1
to k, such that $\otimes(d_1, d_2, \ldots, d_k) = f(d_i)$ for all $d_1, d_2, \ldots, d_k$.

Note  that  constant  functions  are  excluded  from  this  definition,  and  so  are  not 
essentially  unary. 
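
For a small finite domain, Definition 10 can be tested by exhaustive search. The following Python sketch is an illustration under our own naming (`essentially_unary`, `majority_or_first`); it looks for an argument position that alone determines the result, and rejects constant operations as the definition requires.

```python
from itertools import product

def essentially_unary(op, domain, k):
    """True if op(d1, ..., dk) = f(d_i) for some position i and non-constant unary f."""
    all_args = list(product(domain, repeat=k))
    for i in range(k):
        table = {}  # candidate unary function f, built from the i-th argument
        consistent = True
        for args in all_args:
            value = op(*args)
            if table.setdefault(args[i], value) != value:
                consistent = False  # result is not determined by argument i alone
                break
        if consistent and len(set(table.values())) > 1:
            return True
    return False

def majority_or_first(a, b, c):
    """First repeated value of the arguments, or the first value if all differ."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return a

D = [0, 1, 2]
projection_is_unary = essentially_unary(lambda a, b, c: a, D, 3)
shift_is_unary = essentially_unary(lambda a, b: (b + 1) % 3, D, 2)
constant_is_unary = essentially_unary(lambda a, b: 0, D, 2)
majority_is_unary = essentially_unary(majority_or_first, D, 3)
```

Under this check a projection and a shift of one argument count as essentially unary, while a constant operation and the "first repeated value" operation do not.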
3  Closure and Tractability

In  this  Section  we  will  show  that  any  set  of  relations  which  is  only  closed  under 
essentially  unary  operations  will  give  rise  to  a  class  of  problems  which  is  NP- 
complete.  Assuming  that  P  is  not  equal  to  NP,  this  implies  that  any  tractable  set 
of  relations  must  be  closed  under  some  operation  which  is  not  essentially  unary. 
In  other  words,  this  algebraic  property  is  a  necessary  condition  for  tractability. 

Theorem 11. For any finite set of relations, Γ, over a finite set D, either

1.  CSP(Γ) is NP-complete; or

2.  Γ is closed under some operation ⊗ which is not essentially unary.

Proof. When |D| ≤ 2, then we may assume without loss of generality that
D ⊆ {0, 1}, where 0 corresponds to the Boolean value False and 1 corresponds
to the Boolean value True. It follows that the class of problems CSP(Γ) corresponds
to the Generalised Satisfiability problem over the set of logical relations
Γ, as defined in [16] (see also [6]). It was established in [16] that this problem is
NP-complete unless one of the following conditions holds:

1.  Every relation in Γ contains the tuple (0, 0, ..., 0);

2.  Every relation in Γ contains the tuple (1, 1, ..., 1);

3.  Every relation in Γ is definable by a formula in conjunctive normal form in
which each conjunct has at most one negated variable.

4.  Every relation in Γ is definable by a formula in conjunctive normal form in
which each conjunct has at most one unnegated variable.

5.  Every relation in Γ is definable by a formula in conjunctive normal form in
which each conjunct contains at most 2 literals.

6.  Every relation in Γ is the set of solutions of a system of linear equations over
the finite field GF(2).

It is straightforward to show that in each of these cases Γ is closed under some
operation which is not essentially unary (see [8] for details). Hence the result
holds when |D| ≤ 2.

For larger values of |D| we proceed by induction. Assume that |D| ≥ 3 and
the result holds for all smaller values of |D|. Let m = |D|(|D| - 1) and let
$n = |D|^m$. Let M be an m by n matrix over D in which the columns consist of
all possible m-tuples over D (in some order). Let $R_0$ be the relation consisting of
all the tuples occurring as rows of M. The only condition we place on the choice
of order for the columns of M is that $\pi_{1,2}(R_0) = \neq_D$, where $\neq_D$ is the binary
inequality relation over D, as defined in Example 1.

We now construct a relation $\hat{R}_0$ which is the ‘closest approximation’ to $R_0$
that we can obtain from the relations in Γ using the Cartesian product, equality
selection and projection operations:

$$\hat{R}_0 = \bigcap \{\, R \in (\Gamma \cup D^2)^+ \mid R_0 \subseteq R \,\}.$$

Since this is a finite intersection, and intersection is a special case of join, we have
from Lemma 5 that $\hat{R}_0 \in (\Gamma \cup D^2)^+$. In other words, the relation $\hat{R}_0$ may be
obtained as a derived constraint relation in some problem belonging to CSP(Γ).
There are now two cases to consider:

1.  If there exists some tuple $t_0 \in \hat{R}_0$ with $t_0[1] = t_0[2]$, then we will construct,
using $t_0$, an appropriate operation under which Γ is closed.

Define the function $\otimes : D^m \to D$ by setting $\otimes(d_1, d_2, \ldots, d_m) = t_0[j]$ where
j is the unique column of M corresponding to the m-tuple $(d_1, d_2, \ldots, d_m)$.
We will show that Γ is closed under ⊗.

Choose any R ∈ Γ, and let p be the arity of R. We are required to show
that R is closed under ⊗. Consider any sequence $t_1, t_2, \ldots, t_m$ of tuples of
R (not necessarily distinct), and for i = 1, 2, ..., p, let $c_i$ be the m-tuple
$(t_1[i], t_2[i], \ldots, t_m[i])$. For each pair of indices, i, j, such that $c_i = c_j$, apply
the equality selection $\sigma_{i=j}$ to R, to obtain a new relation R'.
Now choose a maximal set of indices, $I = \{i_1, i_2, \ldots, i_s\}$, such that the
corresponding $c_i$ are all distinct, and construct the relation $R'' = \pi_I(R') \times D^{n-s}$.
Finally, permute the coordinate positions of R'' (by a sequence of
Cartesian product, equality selection, and projection operations), such that
$R'' \supseteq R_0$ (this is always possible, by the construction of $R_0$ and R''). Since
$R'' \in (\Gamma \cup D^2)^+$, we know that $t_0$ is a tuple of R'', by the definition of $\hat{R}_0$.
Hence the appropriate projection of $t_0$ is an element of R, and R is closed
under ⊗.

If ⊗ is essentially unary, then let f : D → D be the corresponding unary
operation, and set

$$f(D) = \{\, f(d) \mid d \in D \,\};$$

$$f(\Gamma) = \{\, \{ (f(d_1), f(d_2), \ldots, f(d_r)) \mid (d_1, d_2, \ldots, d_r) \in C \} \mid C \in \Gamma \,\}.$$

By the choice of $t_0$, f cannot be injective, so |f(D)| < |D|. By the inductive
hypothesis, we know that either CSP(f(Γ)) is NP-complete (in which case
CSP(Γ) must also be NP-complete) or else f(Γ) is closed under some
operation ⊗′ which is not essentially unary (in which case Γ is closed under the
operation obtained by composing ⊗′ with f, which is also not essentially unary).
Hence, the result follows by induction in this case.

2. Alternatively, if $R_0$ contains no tuple $t$ such that $t[1] = t[2]$, then $\pi_{1,2}(R_0) = {\neq_D}$,
so  €  {F  UD^)+.  But  this  implies  that  CSP({^d})  is  reducible  to 
CSP($\Gamma$), since every occurrence of the constraint relation $\neq_D$ may be replaced
with an equivalent collection of constraints with relations chosen from $\Gamma$.
However, it was pointed out in Example 1 that CSP($\{\neq_D\}$) corresponds
to the Graph $|D|$-Colorability problem [6], which is NP-complete when
$|D| \geq 3$. Hence, this implies that CSP($\Gamma$) is NP-complete, and the result
holds in this case also.

We may sharpen this result a little further by bounding the possible arity of
the closure operations on $\Gamma$ which we need to consider, as the next result shows.

Theorem 12. For any set of relations $\Gamma$ over a finite set $D$, if $\Gamma$ is closed
under some operation $\oplus$ which is not essentially unary, then it is also closed
under some operation $\otimes$ of arity at most $\max\{3, |D|\}$, which is not essentially
unary.


Proof. (This proof is adapted from the proof of Lemma 1.14 in [17].) Let $\Gamma$ be a
set of relations which is closed under some operation which is not essentially
unary. Let $\otimes$ be an operation of the smallest possible arity, such that $\Gamma$ is closed
under $\otimes$ and $\otimes$ is not essentially unary, and let $k$ be the arity of $\otimes$. If $k \leq 3$,
then the result holds, so we only need to consider the case when $k \geq 4$.

Now consider the operations which are obtained from $\otimes$ by identifying two
arguments. Since $\Gamma$ is closed under these operations, and they have a lower arity
than $\otimes$, they must all be essentially unary, by the choice of $\otimes$.

Hence, if we identify the first two arguments we have

$$\otimes(x_1, x_1, x_3, x_4, \ldots, x_k) = f_{12}(x_i)$$

for some non-constant unary operation $f_{12}$ and some $i \in \{1, 2, \ldots, k\}$. Similarly,
if we identify the third and fourth arguments we have

$$\otimes(x_1, x_2, x_3, x_3, x_5, \ldots, x_k) = f_{34}(x_j)$$

for some non-constant unary operation $f_{34}$ and some $j \in \{1, 2, \ldots, k\}$. This
means that

$$\otimes(x_1, x_1, x_3, x_3, x_5, \ldots, x_k) = f_{12}(x_i) = f_{34}(x_j)$$

for all possible choices of $x_1, x_3, x_5, x_6, \ldots, x_k$, which implies that either $i \notin \{1, 2\}$
or $j \notin \{3, 4\}$. It follows from this that we can permute the order of the arguments
of $\otimes$ to obtain a function $\otimes'$ which satisfies the identity

$$\otimes'(x_1, x_2, x_2, x_4, \ldots, x_k) = f(x_1)$$

for some non-constant unary operation $f$. In particular, we have $\otimes'(x_1, y, y, \ldots, y) = f(x_1)$.
Now, for all distinct pairs of indices $i, j$ in $\{2, 3, \ldots, k\}$, we know that the
$(k-1)$-ary function

$$\otimes'(x_1, x_2, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_{j-1}, x, x_{j+1}, \ldots, x_k)$$

is an essentially unary function, and since we have just shown that for one
particular choice of $x_2, \ldots, x_k$ it equals $f(x_1)$, we know that it must equal
$f(x_1)$ in all cases.

Similarly (using the fact that $k \geq 4$), for all indices $i \in \{2, 3, \ldots, k\}$, we can
establish that

$$\otimes'(x, x_2, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_k) = f(x).$$

Now, if $k > |D|$ then there are more arguments than values, so at least one
argument value must be repeated somewhere, and in that case we would have
$\otimes'(x_1, x_2, \ldots, x_k) = f(x_1)$. Since $\otimes'$ is just $\otimes$ with the order of the arguments
permuted, we have contradicted the fact that $\otimes$ is not essentially unary. This
means that we must have $k \leq |D|$, and the result follows.

4  A Test for Tractability

Let $\Gamma$ be a set of relations over a set $D$. Note that the operations under which
$\Gamma$ is closed are simply mappings from $D^k$ to $D$, for some $k$, which satisfy certain
constraints, as described in Definition 8. In this Section we show that it is
possible to identify these operations by solving a single constraint satisfaction
problem. In fact, we shall show that these closure operations are precisely the
solutions to a constraint satisfaction problem of the following form.

Definition 13. Let $\Gamma$ be a set of relations over a finite domain $D$.

For any natural number $m > 0$, the indicator problem for $\Gamma$ of order $m$ is
defined to be the constraint satisfaction problem $\mathcal{IP}(\Gamma, m)$ with

- Set of variables $D^m$;

- Domain of values $D$;

- Set of constraints $\{C_1, C_2, \ldots, C_q\}$, such that for each $R \in \Gamma$, and for each
sequence $t_1, t_2, \ldots, t_m$ of tuples from $R$, there is a constraint $C_i = (S_i, R)$ with
$S_i = (v_1, v_2, \ldots, v_n)$ where $n$ is the arity of $R$ and $v_j = \langle t_1[j], t_2[j], \ldots, t_m[j] \rangle$.
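The construction in Definition 13 is mechanical, and a short program may make it easier to follow. The sketch below is only an illustration in Python: the function name `indicator_problem` and the data layout are assumptions of this sketch, and the tuples of `R1` are read off the order-1 constraint scopes listed in Example 4, since Example 2 itself is not reproduced here.

```python
from itertools import product


def indicator_problem(relations, domain, m):
    """Build the order-m indicator problem as (variables, constraints).

    relations: dict mapping a relation name to a non-empty set of tuples.
    Each constraint is a pair (scope, name), where scope lists one variable
    (an m-tuple over the domain) per coordinate position of the relation.
    """
    variables = list(product(sorted(domain), repeat=m))
    constraints = []
    for name, rel in relations.items():
        arity = len(next(iter(rel)))
        # One constraint for every sequence t_1, ..., t_m of tuples from rel.
        for rows in product(sorted(rel), repeat=m):
            scope = tuple(tuple(t[j] for t in rows) for j in range(arity))
            constraints.append((scope, name))
    return variables, constraints


D = {0, 1, 2}
R1 = {(0, 0), (0, 1), (1, 2), (2, 1)}  # assumed: read off the scopes in Example 4

vars1, cons1 = indicator_problem({"R1": R1}, D, 1)
vars2, cons2 = indicator_problem({"R1": R1}, D, 2)
```

With these inputs the sketch produces 3 variables and 4 constraints at order 1, and 9 variables and 16 constraints at order 2.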

Example 4. Consider the relation $R_1$ over $D = \{0, 1, 2\}$, defined in Example 2.

The indicator problem for $\{R_1\}$ of order 1, $\mathcal{IP}(\{R_1\}, 1)$, has 3 variables and
4 constraints. The set of variables is

{⟨0⟩, ⟨1⟩, ⟨2⟩},

and the set of constraints is

{ ((⟨0⟩, ⟨0⟩), R_1),
  ((⟨0⟩, ⟨1⟩), R_1),
  ((⟨1⟩, ⟨2⟩), R_1),
  ((⟨2⟩, ⟨1⟩), R_1) }.

The indicator problem for $\{R_1\}$ of order 2, $\mathcal{IP}(\{R_1\}, 2)$, has 9 variables and
16 constraints. The set of variables is

{⟨0,0⟩, ⟨0,1⟩, ⟨0,2⟩, ⟨1,0⟩, ⟨1,1⟩, ⟨1,2⟩, ⟨2,0⟩, ⟨2,1⟩, ⟨2,2⟩},

and the set of constraints is

{ ((⟨0,0⟩, ⟨0,0⟩), R_1), ((⟨0,0⟩, ⟨0,1⟩), R_1),
  ((⟨0,0⟩, ⟨1,0⟩), R_1), ((⟨0,0⟩, ⟨1,1⟩), R_1),
  ((⟨0,1⟩, ⟨0,2⟩), R_1), ((⟨0,1⟩, ⟨1,2⟩), R_1),
  ((⟨0,2⟩, ⟨0,1⟩), R_1), ((⟨0,2⟩, ⟨1,1⟩), R_1),
  ((⟨1,0⟩, ⟨2,0⟩), R_1), ((⟨1,0⟩, ⟨2,1⟩), R_1),
  ((⟨1,1⟩, ⟨2,2⟩), R_1), ((⟨1,2⟩, ⟨2,1⟩), R_1),
  ((⟨2,0⟩, ⟨1,0⟩), R_1), ((⟨2,0⟩, ⟨1,1⟩), R_1),
  ((⟨2,1⟩, ⟨1,2⟩), R_1), ((⟨2,2⟩, ⟨1,1⟩), R_1) }.
Example 5. Consider the relation $R_2$ over $D = \{0, 1, 2\}$, defined in Example 2.

The indicator problem for $\{R_2\}$ of order 1, $\mathcal{IP}(\{R_2\}, 1)$, has 3 variables and
3 constraints. The set of variables is

{⟨0⟩, ⟨1⟩, ⟨2⟩},

and the set of constraints is

{ ((⟨0⟩, ⟨1⟩, ⟨2⟩), R_2),
  ((⟨1⟩, ⟨2⟩, ⟨0⟩), R_2),
  ((⟨2⟩, ⟨0⟩, ⟨1⟩), R_2) }.

The indicator problem for $\{R_2\}$ of order 2, $\mathcal{IP}(\{R_2\}, 2)$, has 9 variables and 9
constraints. The set of variables is

{⟨0,0⟩, ⟨0,1⟩, ⟨0,2⟩, ⟨1,0⟩, ⟨1,1⟩, ⟨1,2⟩, ⟨2,0⟩, ⟨2,1⟩, ⟨2,2⟩},

and the set of constraints is

{ ((⟨0,0⟩, ⟨1,1⟩, ⟨2,2⟩), R_2),
  ((⟨0,1⟩, ⟨1,2⟩, ⟨2,0⟩), R_2),
  ((⟨0,2⟩, ⟨1,0⟩, ⟨2,1⟩), R_2),
  ((⟨1,0⟩, ⟨2,1⟩, ⟨0,2⟩), R_2),
  ((⟨1,1⟩, ⟨2,2⟩, ⟨0,0⟩), R_2),
  ((⟨1,2⟩, ⟨2,0⟩, ⟨0,1⟩), R_2),
  ((⟨2,0⟩, ⟨0,1⟩, ⟨1,2⟩), R_2),
  ((⟨2,1⟩, ⟨0,2⟩, ⟨1,0⟩), R_2),
  ((⟨2,2⟩, ⟨0,0⟩, ⟨1,1⟩), R_2) }.

Example 6. Consider the relations $R_1$ and $R_2$ over $D = \{0, 1, 2\}$, defined in
Example 2.

The indicator problem for $\{R_1, R_2\}$ of order 1, $\mathcal{IP}(\{R_1, R_2\}, 1)$, has 3 variables
and 7 constraints. The set of variables is

{⟨0⟩, ⟨1⟩, ⟨2⟩},

and the set of constraints is equal to the union of the set of constraints of
$\mathcal{IP}(\{R_1\}, 1)$, as defined in Example 4, and the set of constraints of $\mathcal{IP}(\{R_2\}, 1)$,
as defined in Example 5.

The indicator problem for $\{R_1, R_2\}$ of order 2, $\mathcal{IP}(\{R_1, R_2\}, 2)$, has 9 variables
and 25 constraints. The set of variables is

{⟨0,0⟩, ⟨0,1⟩, ⟨0,2⟩, ⟨1,0⟩, ⟨1,1⟩, ⟨1,2⟩, ⟨2,0⟩, ⟨2,1⟩, ⟨2,2⟩},

and the set of constraints is equal to the union of the set of constraints of
$\mathcal{IP}(\{R_1\}, 2)$, as defined in Example 4, and the set of constraints of $\mathcal{IP}(\{R_2\}, 2)$,
as defined in Example 5.
Solutions to the indicator problem for $\Gamma$ of order $m$ are functions from $D^m$ to $D$,
or in other words, $m$-ary operations on $D$. We now show that they are precisely
the $m$-ary operations under which $\Gamma$ is closed.

Theorem 14. For any set of relations $\Gamma$ over domain $D$, the set of solutions to
$\mathcal{IP}(\Gamma, m)$ is equal to the set of $m$-ary operations under which $\Gamma$ is closed.

Proof. By Definition 8, we know that $\Gamma$ is closed under the $m$-ary operation $\otimes$
if and only if $\otimes$ satisfies the condition $\otimes(t_1, t_2, \ldots, t_m) \in R$ for each possible
choice of $R \in \Gamma$ and $t_1, t_2, \ldots, t_m \in R$ (not necessarily all distinct). But this is
equivalent to saying that $\otimes$ satisfies all the constraints in $\mathcal{IP}(\Gamma, m)$, so the result
follows.
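Theorem 14 suggests a direct, if naive, way to list closure operations: enumerate every candidate operation and keep those that map each sequence of tuples of a relation back into that relation. The Python sketch below does this by brute force, so it is only practical for very small orders. The function name is hypothetical, and the relations `R1` and `R2` are assumed to be the ones whose tuples appear as the order-1 scopes in Examples 4 and 5.

```python
from itertools import product


def closure_operations(relations, domain, m):
    """Return all m-ary operations (as dicts D^m -> D) under which every
    relation is closed, by brute force over all |D|^(|D|^m) candidates."""
    dom = sorted(domain)
    variables = list(product(dom, repeat=m))
    solutions = []
    for values in product(dom, repeat=len(variables)):
        op = dict(zip(variables, values))
        # zip(*rows) yields the columns of the chosen tuples, i.e. the scope
        # of one indicator-problem constraint.
        if all(
            tuple(op[col] for col in zip(*rows)) in rel
            for rel in relations
            for rows in product(rel, repeat=m)
        ):
            solutions.append(op)
    return solutions


D = {0, 1, 2}
R1 = {(0, 0), (0, 1), (1, 2), (2, 1)}     # assumed, from Example 4
R2 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # assumed, from Example 5

unary_r1 = closure_operations([R1], D, 1)
binary_r1 = closure_operations([R1], D, 2)
unary_r2 = closure_operations([R2], D, 1)
unary_both = closure_operations([R1, R2], D, 1)
```

The order-3 problems over a three-element domain have 27 variables, so this enumeration is no longer feasible there and a proper constraint solver would be needed.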

Example 7. Consider the relation $R_1$ over $D = \{0, 1, 2\}$, defined in Example 2.

The indicator problem for $\{R_1\}$ of order 1, defined in Example 4, has 2
solutions, which may be expressed in tabular form as follows:

Variables     ⟨0⟩   ⟨1⟩   ⟨2⟩
Solution 1     0     0     0
Solution 2     0     1     2

One of these solutions is a constant operation, so CSP($\{R_1\}$) is tractable, by
Proposition 9 of [9]. In fact, any problem in CSP($\{R_1\}$) has the solution which
assigns the value 0 to each variable, so this class of problems is trivial.

The indicator problem for $\{R_1\}$ of order 2, defined in Example 4, has 4
solutions, which may be expressed in tabular form as follows:

Variables     ⟨0,0⟩ ⟨0,1⟩ ⟨0,2⟩ ⟨1,0⟩ ⟨1,1⟩ ⟨1,2⟩ ⟨2,0⟩ ⟨2,1⟩ ⟨2,2⟩
Solution 1      0     0     0     0     0     0     0     0     0
Solution 2      0     1     2     0     1     2     0     1     2
Solution 3      0     0     0     1     1     1     2     2     2
Solution 4      0     0     0     0     1     0     0     0     2

The first of these solutions is a constant operation, and the second and third are
essentially unary operations. However, the fourth solution shown in the table
is more interesting. It is easily checked that this operation is an associative,
commutative, idempotent (ACI) binary operation, so we have a second proof that
CSP($\{R_1\}$) is tractable, by Theorem 16 of [9]. Furthermore, this result shows
that $R_1$ may be combined with any other relations (of any arity) which are also
closed under this ACI operation to obtain larger tractable problem classes.
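The claim that the fourth solution is an ACI operation can be confirmed by exhaustive checking over the three-element domain. The following Python sketch is one way to do so; it assumes the fourth row of the table is read as the operation that returns its argument when both arguments agree and 0 otherwise, and the variable names are chosen only for this illustration.

```python
from itertools import product

D = (0, 1, 2)

# Fourth row of the order-2 table, in the column order <0,0>, <0,1>, ..., <2,2>.
TABLE = dict(zip(product(D, repeat=2), (0, 0, 0, 0, 1, 0, 0, 0, 2)))


def op(x, y):
    """Closed form assumed for solution 4: x if both arguments agree, else 0."""
    return x if x == y else 0


matches_table = all(op(x, y) == TABLE[(x, y)] for x, y in product(D, repeat=2))
idempotent = all(op(x, x) == x for x in D)
commutative = all(op(x, y) == op(y, x) for x, y in product(D, repeat=2))
associative = all(
    op(op(x, y), z) == op(x, op(y, z)) for x, y, z in product(D, repeat=3)
)
```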

Example 8. Consider the relation $R_2$ over $D = \{0, 1, 2\}$, defined in Example 2.

The indicator problem for $\{R_2\}$ of order 1, defined in Example 5, has 3
solutions, which may be expressed in tabular form as follows:

Variables     ⟨0⟩   ⟨1⟩   ⟨2⟩
Solution 1     0     1     2
Solution 2     1     2     0
Solution 3     2     0     1

The indicator problem for $\{R_2\}$ of order 3 has a very large number of solutions,
including the operation A, defined in Example 3. Hence, we know that
CSP($\Gamma$) is tractable by Theorem 13 of [9]. Furthermore, $R_2$ may be combined
with any other relations which are also closed under this operation to obtain
larger tractable problem classes.

In  order  to  draw  conclusions  about  the  tractability  of  a  set  of  constraints  we 
need  to  be  able  to  distinguish  closure  operations  which  are  essentially  unary 
from  those  which  are  not.  By  Definition  10,  this  means  checking  whether  there 
exists  some  coordinate  position  which  has  the  property  that  when  the  input  value 
in  that  position  is  constant,  then  the  value  of  the  operation  remains  constant. 
This checking may be carried out in $m|D|^m$ steps.
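The coordinate-by-coordinate check described above is easy to express in code. The Python sketch below makes one pass over the $|D|^m$ argument tuples for each of the $m$ positions. It assumes, as a reading of Definition 10 (which is not reproduced here), that an essentially unary operation must depend on exactly one argument through a non-constant unary operation, so that constant operations are excluded. The four sample operations are the order-2 solutions tabulated in Example 7.

```python
from itertools import product


def depends_only_on(op, domain, m, i):
    """True if the m-ary operation op (dict D^m -> D) is determined by argument i."""
    seen = {}
    for args in product(sorted(domain), repeat=m):
        # Remember the first value seen for each input in position i and
        # compare every later occurrence against it.
        if seen.setdefault(args[i], op[args]) != op[args]:
            return False
    return True


def is_essentially_unary(op, domain, m):
    """Assumed reading of Definition 10: op(d_1, ..., d_m) = f(d_i) for some
    position i and some non-constant unary f."""
    if len(set(op.values())) < 2:
        return False  # constant operations are excluded under this reading
    return any(depends_only_on(op, domain, m, i) for i in range(m))


D = (0, 1, 2)
VARIABLES = list(product(D, repeat=2))
ROWS = [
    (0, 0, 0, 0, 0, 0, 0, 0, 0),  # Solution 1 (constant)
    (0, 1, 2, 0, 1, 2, 0, 1, 2),  # Solution 2
    (0, 0, 0, 1, 1, 1, 2, 2, 2),  # Solution 3
    (0, 0, 0, 0, 1, 0, 0, 0, 2),  # Solution 4
]
flags = [is_essentially_unary(dict(zip(VARIABLES, row)), D, 2) for row in ROWS]
```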

Alternatively, the next result shows that the number of solutions to certain
indicator problems provides a sufficient condition for establishing NP-completeness,
without needing to examine the individual solutions in detail.

Corollary 15. Let $\Gamma$ be a set of relations over domain $D$, let $S_1$ be the set
of non-constant solutions to $\mathcal{IP}(\Gamma, 1)$, and let $S_m$ be the set of solutions to
$\mathcal{IP}(\Gamma, m)$, where $m = \max\{3, |D|\}$.

If $|S_m| \leq m|S_1|$, then CSP($\Gamma$) is NP-complete.

Proof. By Theorem 11 and Theorem 12, we know that either CSP($\Gamma$) is NP-complete,
or $\Gamma$ is closed under some operation with arity at most $m$ which is not
essentially unary.

By Theorem 14, $S_m$ is the set of all $m$-ary operations under which $\Gamma$ is closed.
By Definition 10, the number of these which are essentially unary is equal to the
number of non-constant unary operations under which $\Gamma$ is closed, multiplied
by $m$. By Theorem 14, this number is $m|S_1|$. Hence, $|S_m| \geq m|S_1|$ in all cases.

In the limiting case, when $|S_m| = m|S_1|$, we know that every element of $S_m$
is essentially unary, so $\Gamma$ is not closed under any $m$-ary operation which is not
essentially unary. It follows that $\Gamma$ is not closed under any operation of arity
lower than $m$ which is not essentially unary, so CSP($\Gamma$) must be NP-complete.
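For a two-element domain we have $m = \max\{3, |D|\} = 3$, so the order-3 indicator problem has only $2^3 = 8$ variables and the counting test can be run by brute force. The Python sketch below illustrates this under two assumptions: the condition of Corollary 15 is read as $|S_m| \leq m|S_1|$, and the two Boolean relations used as input (exactly-one-of-three and disequality) are hypothetical examples chosen for this illustration, not relations discussed in the paper.

```python
from itertools import product


def count_closure_operations(relations, m, nonconstant_only=False):
    """Count m-ary operations on {0, 1} under which every relation is closed."""
    variables = list(product((0, 1), repeat=m))
    count = 0
    for values in product((0, 1), repeat=len(variables)):
        if nonconstant_only and len(set(values)) == 1:
            continue
        op = dict(zip(variables, values))
        if all(
            tuple(op[col] for col in zip(*rows)) in rel
            for rel in relations
            for rows in product(rel, repeat=m)
        ):
            count += 1
    return count


def np_complete_by_counting(relations):
    """Sufficient test of Corollary 15 for |D| = 2, where m = 3."""
    s1 = count_closure_operations(relations, 1, nonconstant_only=True)
    s3 = count_closure_operations(relations, 3)
    return s3 <= 3 * s1


ONE_IN_THREE = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}  # hypothetical example relation
NOT_EQUAL = {(0, 1), (1, 0)}                      # hypothetical example relation

hard = np_complete_by_counting([ONE_IN_THREE])
not_established = np_complete_by_counting([NOT_EQUAL])
```

The first relation is closed only under the three projections, so the test succeeds. The second is also closed under operations such as majority, so the count exceeds the bound and the test establishes nothing.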

When $|D| = 2$, the converse result also holds [8]. Hence solving the indicator
problems of order 1 and order 3 provides a simple and complete test for
tractability of any set of relations over a domain with 2 elements. This answers
a question posed by Schaefer in 1978 [16] concerning the existence of an efficient
test for tractability in the GENERALISED SATISFIABILITY problem. Note that
carrying out the test requires finding the number of solutions to a constraint
satisfaction problem with just 8 Boolean variables.

For larger domains, Corollary 15 establishes NP-completeness for many sets
of constraints without the need for individually constructed reduction arguments,
as the following examples illustrate.

Example 9. Consider the relations $R_1$ and $R_2$ over $D = \{0, 1, 2\}$, defined in
Example 2.

The indicator problem for $\{R_1, R_2\}$ of order 1, defined in Example 6, has 1
solution, corresponding to the identity operation.

The indicator problem for $\{R_1, R_2\}$ of order 3 has 3 solutions (which are all
essentially unary).

Hence CSP($\{R_1, R_2\}$) is NP-complete, by Corollary 15.

Example 10. Consider the relations $R_1$, $R_2$, $R_3$ and $R_4$ over $D = \{0, 1, 2\}$, defined
in Example 2.

By counting the solutions to the indicator problems of order 1 and order 3
for each relation and each pair of distinct relations in this set, we are able to
complete the analysis of the complexity of CSP($\Gamma$) for each possible subset $\Gamma$ of
these relations.

Relations Γ    # Solutions IP(Γ, 1)    # Solutions IP(Γ, 3)    Complexity of CSP(Γ)
               (non-constant)
{R_1}           1                       12                      Trivial (Example 7)
{R_2}           3                       >9                      Polynomial (Example 8)
{R_3}           10                      >30                     Trivial
{R_4}           6                       18                      NP-complete (Example 1)
{R_1, R_2}      1                       3                       NP-complete (Corollary 15)
{R_1, R_3}      1                       3                       NP-complete (Corollary 15)
{R_1, R_4}      1                       3                       NP-complete (Example 1)
{R_2, R_3}      1                       3                       NP-complete (Corollary 15)
{R_2, R_4}      1                       3                       NP-complete (Example 1)
{R_3, R_4}      2                       6                       NP-complete (Example 1)

For all larger sets of relations $\Gamma \subseteq \{R_1, R_2, R_3, R_4\}$, we have that $\Gamma$ contains
at least one of the pairs of relations shown in the table, and so CSP($\Gamma$) is
NP-complete.

How practical is the test proposed here in general? For any set of relations
$\Gamma$ over domain $D$, the indicator problem of order $m$ has $|D|^m$ variables and
$\sum_{R \in \Gamma} |R|^m$ constraints. Hence, for small values of $|D|$ the size of the relevant
indicator problems is very small, and the solutions may be found easily. This
remains true even when the arity of the relations in $\Gamma$ is large.
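The size formulas can be evaluated directly. The Python sketch below computes them for a given order; the function name is hypothetical, and the relations `R1` and `R2` are again assumed to be those whose tuples appear as the order-1 scopes in Examples 4 and 5.

```python
def indicator_problem_size(relations, domain_size, m):
    """Return (number of variables, number of constraints) of the order-m
    indicator problem: |D|^m variables and the sum of |R|^m over all R."""
    num_variables = domain_size ** m
    num_constraints = sum(len(rel) ** m for rel in relations)
    return num_variables, num_constraints


R1 = {(0, 0), (0, 1), (1, 2), (2, 1)}     # assumed, from Example 4
R2 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # assumed, from Example 5

sizes = [indicator_problem_size([R1, R2], 3, m) for m in (1, 2, 3)]
```

Note that the arity of the relations does not enter the formulas: only the number of tuples in each relation matters.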

As the domain size increases, the size of the indicator problems increases
rapidly, and it becomes impractical to compute all solutions. On the other hand,
for cases of interest, it may be possible to establish from known properties of
the constraints that the relevant indicator problems will have particular types
of solution, without carrying out a complete solution algorithm. This question
is currently being investigated.

5  Conclusion 

In this paper we have shown how the algebraic properties of relations may be
used to distinguish between sets of relations which give rise to tractable
constraint satisfaction problems and those which give rise to NP-complete classes
of problems.

In  particular,  we  have  established  that  any  set  of  relations  which  does  not 
give  rise  to  an  NP-complete  class  of  constraint  satisfaction  problems  must  be 
closed  under  some  operation  which  is  not  essentially  unary. 

Furthermore,  we  have  proposed  a  method  for  determining  the  operations 
under  which  a  set  of  relations  is  closed  by  solving  a  particular  form  of  constraint 
satisfaction  problem,  which  we  have  called  an  indicator  problem. 

For problems where the domain contains just two elements these results provide
a necessary and sufficient condition for tractability (assuming that P is not
equal to NP), and an efficient test to distinguish the tractable sets of relations.

For problems with larger domains the closure condition we have described is
a necessary condition for tractability. The converse of this condition provides a
sufficient condition for NP-completeness, which we have shown is widely
applicable and easy to test.

We are now investigating the application of these results to particular problem
types, such as temporal problems involving subsets of the interval algebra.
We are also attempting to determine how the presence of particular algebraic
closure properties in the constraints may be used to derive appropriate efficient
algorithms for tractable problem classes.
Abstract. In a recent paper¹, the concept of “free amalgamation” has
been introduced as a general methodology for interweaving solution
structures for symbolic constraints, and it was shown how constraint
solvers for two components can be lifted to a constraint solver for the free
amalgam. Here we discuss a second general way for combining solution
domains, called rational amalgamation. In practice, rational amalgamation
seems to be the preferred combination principle if the two solution
structures to be combined are “rational” or “non-wellfounded” domains.
It represents, e.g., the way rational trees and rational lists are
interwoven in the solution domain of Prolog III, and a variant has been used
by W. Rounds for combining feature structures and hereditarily finite
non-wellfounded sets. We show that rational amalgamation is a general
combination principle, applicable to a large class of structures. As in the
case of free amalgamation, constraint solvers for two component
structures can be combined to a constraint solver for their rational amalgam.
From this algorithmic point of view, rational amalgamation seems to be
interesting since the combination technique for rational amalgamation
avoids one source of non-determinism that is needed in the corresponding
scheme for free amalgamation.


1  Introduction 

One idea behind constraint solving is to use specialized formalisms and inference
mechanisms to solve domain-specific tasks. In many applications, however, one
is faced with a complex combination of different problems, which means that a
system tailored to solving a single problem can only be applied if it is possible
to combine it with other specialized systems. The present paper, as its
predecessor [BS95], marks one step in a program where we try to characterize the most
important general constructions for combining solution domains and constraint
solvers for symbolic constraints. A general combination method, in our sense,
has to give answers to two problems. First, it must offer a general construction
for combining two solution domains. Second, a combination algorithm has

*  This  work  was  supported  by  a  DFG  grant  (SSP  “Deduktion”)  and  by  the  EC  Working 
Group  CCL,  EP6028. 

¹ See [BS95].

to be given that reduces the problem of solving “mixed” constraints over the
combined solution domain to the problem of solving “pure” constraints over the
two component structures. We think that it is in fact possible to characterize
a small set of fundamental combination methods that describe, modulo minor
deviations, all known instances of combined symbolic constraint systems in this
sense.

In [BS95] the notion of the free amalgamated product of two component
structures was introduced. This product is characterized by a universality property:
it represents a most general object among all structures that can be considered
as a reasonable combination of the two components. For SC-structures over
disjoint signatures an explicit construction of the free amalgamated product of
two components was given and it was shown how given constraint solvers for
the component structures can be combined to a constraint solver for the free
amalgam.

In the present paper we introduce a second systematic way to combine
constraint systems over SC-structures, called rational amalgamation. Free and
rational amalgamation both yield a combined structure with “mixed” elements that
interweave a finite number of “pure” elements of the two components in a
particular way. The difference between both constructions becomes transparent when
we ignore the interior structure of these pure subelements and consider them as
construction units with a fixed arity, similar to “complex function symbols”.
Under this perspective, and ignoring details, mixed elements of the free
amalgam can be considered as finite trees, whereas mixed elements of the rational
amalgam are like rational trees.

Figure: Mixed element of free amalgam (1) and of rational amalgam (2).


Against this background it should not be surprising that in practice rational
amalgamation appears to be the preferred combination principle in situations where
the two solution structures to be combined are themselves “rational” or “cyclic”
domains: for example, it represents the way rational trees and rational
lists are interwoven in the solution domain of Prolog III ([Co90]), and a
variant of rational amalgamation has been used to combine feature structures with
non-wellfounded sets in a system introduced by W. Rounds [Ro88].

We introduce rational amalgamation as a general construction that can be
used to combine so-called non-collapsing SC-structures over disjoint signatures.
It is then shown how constraint solving in the rational amalgam can be reduced
to constraint solving in the components. The decomposition scheme that is used
is closely related to the decomposition algorithm for free amalgamation, but it
avoids one highly non-deterministic step that is needed in the latter scheme.
Hence, when matters of efficiency become important, rational amalgamation
might be the better choice.

Let us now briefly indicate which insights could be gained from a classification
of the basic methodologies for combining constraint systems. Below we shall
summarize what has been obtained so far.

1. It helps to understand the scale of possibilities and the general limitations
for combining constraint systems.

2. It might facilitate the design of new combined constraint systems, and it
helps to understand existing instances of combination from a general point
of view.

3. It establishes new and interesting connections between the theory of
constraint solving and other areas such as, e.g., universal algebra and logic.

4. The relationship between different methodologies for combining constraint
systems is interesting per se, as we hope to verify.

1. From our present perspective, which is explained in the conclusion, free and
rational amalgamation, and a related construction called “infinite amalgamation”,
seem to be the most important combination principles in a spectrum of
related methods. Furthermore, we are confident that the abstract definition of
an SC-structure, as introduced in [BS95] and used here, captures a maximal
class of (unsorted!) structures where these combination principles can be
applied in a uniform way. This class covers most of the non-numerical and non-finite
solution domains that are used in constraint solving. All the solution domains
that are considered in the area of unification modulo equational theories are
SC-structures. Furthermore, the algebra of rational trees, feature structures,
and structures with finite or rational nested sets, lists and multisets are
SC-structures.

2. The techniques that are described in this paper show, e.g., that there is a
common and general methodology behind Colmerauer’s combination of rational
trees and rational lists in the solution domain of Prolog III ([Co90]) and
Rounds’ [Ro88] combination of feature structures with non-wellfounded sets.
The abstract techniques described in the present paper can be used, e.g., to
obtain similar combinations that mix rational trees, feature structures, rational
lists, nested multi-sets, or quotient term algebras for collapse-free equational
theories over disjoint signatures in arbitrary manner.

3. The definition of an SC-structure properly generalizes the notion of a free
structure. Still, SC-structures have what is sometimes called the “unique mapping
property” of free structures, and a major part of the theory of free structures
as developed in universal algebra can be lifted to the case of SC-structures. A
detailed (mathematical) investigation of this point is in progress. Furthermore,
it has turned out that the methods for combining solution domains developed
in [BS95] and here, and the general methods for combining logics described by
Gabbay [Ga96] and Pfalzgraf [Pf91, PS94] follow the very same abstract idea.
See [Ga96] for a first discussion of this issue.

The conclusion will be used to comment on item 4. For lack of space we had
to omit proofs. All proofs can be found in the long version [KS96] of the paper.


2  Preliminaries 

A signature $\Sigma$ consists of a set $\Sigma_F$ of function symbols and a disjoint set $\Sigma_P$ of
predicate symbols (not containing “=”), each of fixed arity. The signatures
considered in this paper may be finite or countably infinite. Expressions $\mathcal{A}^\Sigma$ denote
$\Sigma$-structures over the carrier set $A$, and $f_{\mathcal{A}}$ ($p_{\mathcal{A}}$) stands for the interpretation of
$f \in \Sigma_F$ ($p \in \Sigma_P$) in $\mathcal{A}^\Sigma$. $\Sigma$-terms ($t, t_1, \ldots$) and atomic $\Sigma$-formulas (of the form
$t_1 = t_2$, or of the form $p(t_1, \ldots, t_n)$) are built as usual. A $\Sigma$-formula $\varphi$ is written
in the form $\varphi(v_1, \ldots, v_n)$ in order to indicate that the set $Var(\varphi)$ of free variables
of $\varphi$ is a subset of $\{v_1, \ldots, v_n\}$. We write $\mathcal{A}^\Sigma \models \varphi(v_1/a_1, \ldots, v_n/a_n)$ if $\varphi$
becomes true in $\mathcal{A}^\Sigma$ under all assignments that map $v_i$ to $a_i \in A$, for $1 \leq i \leq n$.

$\Sigma$-homomorphisms, $\Sigma$-isomorphisms, and $\Sigma$-endomorphisms are defined as
usual, see e.g. [Ma71, BS95]. With $End^\Sigma_{\mathcal{A}}$ we denote the monoid of all
endomorphisms of $\mathcal{A}^\Sigma$, with composition as operation. If $g : A \to B$ and $h : B \to C$
are mappings, then $g \circ h : A \to C$ denotes their composition. Expressions like
$\vec{v}$, $\vec{a}$ are used to denote finite sequences. If $\vec{a} = a_1, \ldots, a_n$ is a sequence of
elements of $A$ and if $m$ is a mapping with domain $A$, then $m(\vec{a})$ denotes the
sequence $m(a_1), \ldots, m(a_n)$. If $\vec{v} = v_1, \ldots, v_n$, then $\mathcal{A}^\Sigma \models \varphi(\vec{v}/\vec{a})$ is shorthand for
$\mathcal{A}^\Sigma \models \varphi(v_1/a_1, \ldots, v_n/a_n)$. The symbol “$\uplus$” denotes disjoint set union.

3  Non-collapsing  SC-structures 

In this section we introduce the class of structures for which we can use the
rational amalgamation construction (Definition 7). First we recall the definition
of SC-structures given in [BS95]. We consider a fixed $\Sigma$-structure $\mathcal{A}^\Sigma$, and $\mathcal{M}$
always denotes a submonoid of $End^\Sigma_{\mathcal{A}}$.

Definition 1. Let $A_0, A_1$ be subsets of $A$. Then $A_0$ stabilizes $A_1$ with respect
to $\mathcal{M}$ iff all elements $m_1$ and $m_2$ of $\mathcal{M}$ that coincide on $A_0$ also coincide on $A_1$.
For $A_0 \subseteq A$ the stable hull of $A_0$ with respect to $\mathcal{M}$ is the set

$$SH^{\mathcal{A}}_{\mathcal{M}}(A_0) := \{ a \in A \mid A_0 \text{ stabilizes } \{a\} \text{ with respect to } \mathcal{M} \}.$$

$SH^{\mathcal{A}}_{\mathcal{M}}(A_0)$ is always a $\Sigma$-substructure of $\mathcal{A}^\Sigma$, and $A_0 \subseteq SH^{\mathcal{A}}_{\mathcal{M}}(A_0)$. The stable
hull of $A_0$ can be larger than the $\Sigma$-subalgebra generated by $A_0$.
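The definition of the stable hull can be tried out on a small finite toy model. The Python sketch below is only an illustration: the four-element carrier, the unary function `F`, and the four-element monoid of maps are all hypothetical, and such a finite structure is of course not an SC-structure, which must be countably infinite.

```python
def stabilizes(a0, a1, monoid):
    """True if any two maps in `monoid` that coincide on a0 also coincide on a1."""
    return all(
        all(m1[a] == m2[a] for a in a1)
        for m1 in monoid
        for m2 in monoid
        if all(m1[a] == m2[a] for a in a0)
    )


def stable_hull(a0, carrier, monoid):
    """All elements a of the carrier such that a0 stabilizes {a}."""
    return {a for a in carrier if stabilizes(a0, {a}, monoid)}


# Hypothetical toy structure: one unary function F with F(F(z)) = F(z).
ATOMS = ("x", "y")
F = {"x": "fx", "y": "fy", "fx": "fx", "fy": "fy"}
CARRIER = set(F)
# Each map sends the atoms x, y to atoms u, v and commutes with F.
MONOID = [
    {"x": u, "y": v, "fx": F[u], "fy": F[v]}
    for u in ATOMS
    for v in ATOMS
]

hull_x = stable_hull({"x"}, CARRIER, MONOID)
hull_fx = stable_hull({"fx"}, CARRIER, MONOID)
hull_xy = stable_hull({"x", "y"}, CARRIER, MONOID)
```

In this toy model the stable hull of `{"fx"}` contains the atom `"x"`, although `"x"` cannot be generated from `"fx"` by applying `F`, which illustrates how a stable hull can be larger than the generated subalgebra.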

Definition 2. The set $X \subseteq A$ is an $\mathcal{M}$-atom set for $\mathcal{A}^\Sigma$ if every mapping $X \to A$
can be extended to an endomorphism in $\mathcal{M}$.

Definition 3. A countably infinite $\Sigma$-structure $\mathcal{A}^\Sigma$ is an SC-structure (simply
combinable structure) iff there exists a submonoid $\mathcal{M}$ of $End^\Sigma_{\mathcal{A}}$ such that $\mathcal{A}^\Sigma$ has
an infinite $\mathcal{M}$-atom set $X$ where every $a \in A$ is stabilized by a finite subset of $X$
with respect to $\mathcal{M}$. We denote this SC-structure by $(\mathcal{A}^\Sigma, \mathcal{M}, X)$. If $\mathcal{M} = End^\Sigma_{\mathcal{A}}$,
then $(\mathcal{A}^\Sigma, End^\Sigma_{\mathcal{A}}, X)$ is called a strong SC-structure.

Example 1. The class of SC-structures contains, e.g., all free structures (see, e.g.,
[Ma71]), rational tree algebras ([Co84, Ma88]), feature structures (for specificity,
we refer to [AP94]), feature structures with arity ([ST94, BT94]), domains with
nested, finite or rational lists (rational lists are used in Prolog III, see [Co90]),
and domains with nested, finite or rational sets (as introduced in [Ac88] and
used in [Ro88]).² In each case, we have to take the non-ground variant since we
assume the existence of a countably infinite set of atoms. With the exception of
feature structures, all these structures are strong SC-structures. For details we
refer to [BS95].

In  the  rest  of  this  section,  {A^,M,X)  denotes  a  fixed  SC-structure. 

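Lemma 4. Let $\varphi(v_1, \ldots, v_k)$ be a positive $\Sigma$-formula, let $m \in \mathcal{M}$, and let
$a_1, \ldots, a_k \in A$. Then $A^\Sigma \models \varphi(v_1/a_1, \ldots, v_k/a_k)$ implies
$A^\Sigma \models \varphi(v_1/m(a_1), \ldots, v_k/m(a_k))$.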

A fundamental property of SC-structures is the following ([BS95], Lemma 13):
for each $a \in A$ there exists a unique minimal finite set $Y \subseteq X$ such that
$a \in SH^{\mathcal{M}}(Y)$.

Definition 5. The stabilizer of $a \in A$ with respect to $\mathcal{M}$, $Stab^{\mathcal{M}}(a)$, is the
unique minimal finite subset $Y$ of $X$ such that $a \in SH^{\mathcal{M}}(Y)$. The stabilizer of
$A' \subseteq A$ is the set $Stab^{\mathcal{M}}(A') := \bigcup_{a \in A'} Stab^{\mathcal{M}}(a)$.

The  next  lemma  plays  a  crucial  role  in  the  rational  amalgamation  construction. 
It  is  used  in  many  proofs. 

Lemma 6. Let $m \in \mathcal{M}$ be an endomorphism of the SC-structure $(A^\Sigma, \mathcal{M}, X)$
such that the restriction of $m$ on $X$ is a mapping $X \to X$. If $Stab^{\mathcal{M}}(a) =
\{x_1, \ldots, x_k\}$, then $Stab^{\mathcal{M}}(m(a)) \subseteq \{m(x_1), \ldots, m(x_k)\}$. If $m$ is an
automorphism, then $Stab^{\mathcal{M}}(m(a)) = \{m(x_1), \ldots, m(x_k)\}$.
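
To make the inclusion in Lemma 6 concrete, the following sketch looks at the special case of a free term algebra. It assumes, for illustration only, that the atoms are the variables and that the stabilizer of a term is the set of variables occurring in it; the term encoding and the names `variables` and `apply` are hypothetical and not part of the construction.

```python
# Terms of a free term algebra: a variable (atom) is a str,
# a compound term is a tuple (function_symbol, arg_1, ..., arg_n).

def variables(term):
    """Return the set of atoms (variables) occurring in term."""
    if isinstance(term, str):
        return {term}
    _, *args = term
    result = set()
    for arg in args:
        result |= variables(arg)
    return result

def apply(mapping, term):
    """Apply the endomorphism induced by an atom mapping (identity elsewhere)."""
    if isinstance(term, str):
        return mapping.get(term, term)
    symbol, *args = term
    return (symbol, *(apply(mapping, arg) for arg in args))

t = ("f", "x1", ("g", "x2", "x1"))
m = {"x1": "y", "x2": "y"}  # maps atoms to atoms, but is not injective

# Assumed reading of Lemma 6: Stab(m(t)) is contained in {m(x) | x in Stab(t)}.
image_of_stabilizer = {apply(m, x) for x in variables(t)}
assert variables(apply(m, t)) <= image_of_stabilizer
```

In this special case the inclusion is even an equality; the lemma only guarantees equality for automorphisms in general.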

We  can  now  characterize  the  subclass  of  SC-structures  for  which  we  can  use  the 
rational  amalgamation  construction. 

Definition 7. An SC-structure $(A^\Sigma, \mathcal{M}, X)$ is non-collapsing if every endomorphism
$m \in \mathcal{M}$ maps non-atoms to non-atoms (i.e., $m(a) \in A \setminus X$ for all $a \in A \setminus X$
and all $m \in \mathcal{M}$).

E.g., quotient term algebras for collapse-free equational theories (see [BS94]),
rational tree algebras, feature structures, feature structures with arity, and the
domains with nested, finite or rational lists (as mentioned in Example 1) are always
non-collapsing.

^  The  signatures  of  these  structures  may  be  finite  or  countably  infinite,  because  each 
element  is  a  finite  or  rational  “tree”  and  hence  composed  using  a  finite  part  of  the 
signature  only.  This  guarantees  that  the  complete  structure  is  countably  infinite,  as 
demanded  in  Definition  3. 
4  The  Domain  of  the  Rational  Amalgam

The definition of the underlying domain is the most complicated step of the
rational amalgamation construction. The description would be much simpler if
we restricted the construction to components where the elements have a
particular form (e.g., the form of trees). But such a restriction would contradict
our motivation to describe a general construction. We shall first introduce the
notion of a “braid” and its standard normal form. The set of braids in standard
normal form will represent the carrier of the rational amalgam. We shall describe
the rational amalgamation of two component structures. There is, however, no
difficulty in interweaving any finite number of components in the same way.

Throughout this section $(A^\Sigma, \mathcal{M}, X)$ and $(B^\Delta, \mathcal{N}, Y)$ denote two fixed
non-collapsing SC-structures over disjoint signatures. The atom sets $X$ and $Y$ have
the form $X = Z \uplus O_A$ and $Y = Z \uplus O_B$, where the sets $Z$, $O_A$, and $O_B$ are all
infinite,  and  where  H  =  0.  The  atoms  in  Z  will  be  called  bottom  atoms, 
the atoms in $O_A$ ($O_B$) will be called open atoms. In the braid construction, the
bottom atoms will play the role of ordinary atoms, or leaves. Open atoms, in
contrast, are only used to connect elements of both structures. With $O_A(a)$ and
$O_A(A')$ we denote the set of open atoms occurring in the stabilizer of $a \in A$
($A' \subseteq A$) with respect to $\mathcal{M}$. Similarly, the expressions $O_B(b)$ ($O_B(B')$) are used to
denote the set of open atoms occurring in the stabilizer of $b \in B$ ($B' \subseteq B$) with
respect to $\mathcal{N}$.

Definition 8. Let $O'_A \subseteq O_A$, $O'_B \subseteq O_B$, let $\pi_A : O'_A \to B$, $\pi_B : O'_B \to A$, let
$\pi := \pi_A \cup \pi_B$. An element $a \in A$ is directly linked to $b \in B$ via $\pi$ if there is an
$o \in O_B(b)$ such that $a = \pi_B(o)$. Analogously $b \in B$ is directly linked to $a \in A$
via $\pi$ if there exists an $o \in O_A(a)$ such that $b = \pi_A(o)$. An element $a \in A \cup B$
is a $\pi$-descendant of $b \in A \cup B$ if there exists a sequence $a = a_0, a_1, \ldots, a_n = b$
($n \ge 0$) such that each $a_i$ is directly linked to $a_{i+1}$ via $\pi$, for $0 \le i \le n - 1$.

Definition 9. A braid of type A over $A^\Sigma, B^\Delta$ is a quintuple $\mathcal{K} =
\langle a, C, D, \pi_A, \pi_B \rangle$, where

1. $a \in A \setminus O_A$,

2. $C$ is a finite subset of $A$ containing $a$. All elements of $C \setminus \{a\}$ are non-atomic.
$D$ is a finite set of non-atomic elements of $B$,

3. $\pi_A : O_A(C) \to D$ and $\pi_B : O_B(D) \to C$ are mappings. For $(o, e) \in \pi_A \cup \pi_B$,
$e$ is always a non-atomic element,

4. each element in $C \cup D$ is a $\pi$-descendant of $a$, for $\pi := \pi_A \cup \pi_B$.

The element $a$ is called the root of $\mathcal{K}$. The elements in the sets $C$ and $D$ are
called the elements of $\mathcal{K}$ of type A and B respectively. The functions $\pi_A$ and $\pi_B$
are called the linking functions of $\mathcal{K}$ of type A and B respectively.

Braids of type B, with root in $B \setminus O_B$, are defined symmetrically. A braid
$\mathcal{K}$ is called trivial if the root of $\mathcal{K}$ is a bottom atom $z \in Z$. In this case, $z$ is
the only element of the braid. It does not make sense to distinguish between the
trivial braid $\langle z, \{z\}, \emptyset, \emptyset, \emptyset \rangle$ of type A and the trivial braid $\langle z, \emptyset, \{z\}, \emptyset, \emptyset \rangle$ of type
B. We identify both braids. Hence, trivial braids have mixed type.
Example 2. The following figure represents a braid over two term algebras, for
signatures $\Sigma = \{f, a\}$ and $\Delta = \{g\}$ respectively.


We sometimes write $O_A(\mathcal{K})$ and $O_B(\mathcal{K})$ for $O_A(C)$ and $O_B(D)$ respectively, and
$O(\mathcal{K})$ denotes the union $O_A(\mathcal{K}) \cup O_B(\mathcal{K})$. A quintuple $\mathcal{K} = \langle a, C, D, \pi_A, \pi_B \rangle$ that
satisfies Conditions 1-3 of Definition 9 will be called a prebraid.

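Definition 10. Let $\mathcal{K} = \langle a, C, D, \pi_A, \pi_B \rangle$ be a braid. Then the braid $\mathcal{K}' :=
\langle a', C', D', \pi'_A, \pi'_B \rangle$ (of type A or B) is a subbraid of $\mathcal{K}$ if $a' \in C \cup D$, $C' \subseteq C$,
$D' \subseteq D$, $\pi'_A \subseteq \pi_A$, and $\pi'_B \subseteq \pi_B$.

Sub(pre)braids of prebraids are defined in the same way.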

Lemma  11.  For  each  element  e  of  a  prebraid  K,  there  exists  a  unique  subbraid 
of  K  with  root  e. 
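
Condition 4 of Definition 9 and the subbraid of Lemma 11 amount to a reachability computation once a prebraid is given as finite data. The sketch below assumes a hypothetical encoding: elements are arbitrary labels, `open_atoms[e]` lists the open atoms in the stabilizer of `e`, and `pi` is the union of the two linking functions as a dictionary. None of these names come from the construction itself.

```python
def descendants(root, open_atoms, pi):
    """Elements reachable from root by following open atoms through pi."""
    seen = {root}
    stack = [root]
    while stack:
        element = stack.pop()
        for atom in open_atoms.get(element, ()):
            target = pi.get(atom)
            if target is not None and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def is_braid(root, C, D, open_atoms, pi):
    """Condition 4: every element of C and D is a pi-descendant of the root."""
    return set(C) | set(D) <= descendants(root, open_atoms, pi)

def subbraid(e, C, D, open_atoms, pi):
    """Restrict a prebraid to the descendants of e (sketch of Lemma 11)."""
    reach = descendants(e, open_atoms, pi)
    atoms = set()
    for element in reach:
        atoms |= set(open_atoms.get(element, ()))
    restricted_pi = {o: pi[o] for o in atoms if o in pi}
    return e, set(C) & reach, set(D) & reach, restricted_pi

# A small prebraid: a --o1--> d --u1--> c
C, D = {"a", "c"}, {"d"}
open_atoms = {"a": {"o1"}, "d": {"u1"}, "c": set()}
pi = {"o1": "d", "u1": "c"}
assert is_braid("a", C, D, open_atoms, pi)
```

Calling `subbraid("d", C, D, open_atoms, pi)` keeps only `d` and `c` together with the single link from `u1`.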

The concrete open atoms that are used to organize links between elements
of distinct type in a given braid should be regarded as irrelevant. The following,
purely algebraic notions are used to formalize this idea. An endomorphism $m \in
\mathcal{M}$ ($n \in \mathcal{N}$) is called admissible if $m$ ($n$) leaves all bottom atoms $z \in Z$ fixed and
if $m(o) \in O_A$ ($n(o) \in O_B$) for all $o \in O_A$ ($o \in O_B$). Automorphisms are called
admissible if they define a permutation of the set of open atoms while leaving
bottom atoms fixed. A pair $(m, n) \in \mathcal{M} \times \mathcal{N}$ is called admissible if both $m$ and
$n$ are admissible.

Definition 12. Let $\mathcal{K} = \langle a, C, D, \pi_A, \pi_B \rangle$ and $\mathcal{K}' = \langle a', C', D', \pi'_A, \pi'_B \rangle$ be two
prebraids, say, of type A. $\mathcal{K}'$ is called a variant of $\mathcal{K}$ if there exists an admissible
pair of automorphisms $(m, n)$ such that $a' = m(a)$, $C' = \{m(c) \mid c \in C\}$, $D' =
\{n(d) \mid d \in D\}$, $\pi'_A = \{(m(o), n(d)) \mid (o, d) \in \pi_A\}$, and $\pi'_B = \{(n(o), m(c)) \mid
(o, c) \in \pi_B\}$.

Two  (pre)braids  that  are  variants  of  each  other  are  meant  to  denote  the  same 
object.  But  then  we  should  not  distinguish  between  two  subbraids  of  one  and 
the  same  (pre)braid  if  they  are  variants.  In  order  to  identify  such  subbraids,  we 
use  admissible  pairs  of  endomorphisms  of  a  particular  type. 

^  Intuitively,  admissible  endomorphisms  cause  a  “renaming”  of  open  atoms,  compare 
Lemma  6.  They  may  identify  distinct  open  atoms. 
Definition 13. The admissible pair $(m, n)$ is a simplifier for the prebraid $\mathcal{K} =
\langle a, C, D, \pi_A, \pi_B \rangle$ if the following conditions hold:

- $\forall o_1, o_2 \in O_A(C)$: $m(o_1) = m(o_2)$ implies $n(\pi_A(o_1)) = n(\pi_A(o_2))$,

- $\forall o_1, o_2 \in O_B(D)$: $n(o_1) = n(o_2)$ implies $m(\pi_B(o_1)) = m(\pi_B(o_2))$.
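
The two conditions of Definition 13 can be checked mechanically on finite data. The sketch below is an illustration under assumptions: elements are encoded as terms (tuples) over open atoms (strings), `m` and `n` are given as callables, and admissibility of the pair is taken for granted rather than verified. The helper names `substitute`, `respects` and `is_simplifier` are hypothetical.

```python
from itertools import combinations

def substitute(mapping, term):
    """Rename atoms in a term; atoms are str, compound terms are tuples."""
    if isinstance(term, str):
        return mapping.get(term, term)
    symbol, *args = term
    return (symbol, *(substitute(mapping, a) for a in args))

def respects(f, g, pi):
    """f(o1) == f(o2) must imply g(pi[o1]) == g(pi[o2]) on the domain of pi."""
    return all(
        g(pi[o1]) == g(pi[o2])
        for o1, o2 in combinations(pi, 2)
        if f(o1) == f(o2)
    )

def is_simplifier(m, n, pi_A, pi_B):
    """Both conditions of Definition 13 (admissibility is assumed)."""
    return respects(m, n, pi_A) and respects(n, m, pi_B)

# Root f(o1, o2); both open atoms point to copies of g(u_i) that point back.
root = ("f", "o1", "o2")
pi_A = {"o1": ("g", "u1"), "o2": ("g", "u2")}
pi_B = {"u1": root, "u2": root}

m = lambda t: substitute({"o2": "o1"}, t)   # identifies o1 and o2
n = lambda t: substitute({"u2": "u1"}, t)   # identifies u1 and u2
identity = lambda t: t

assert is_simplifier(m, n, pi_A, pi_B)
assert not is_simplifier(m, identity, pi_A, pi_B)
```

The second assertion fails the first condition: `m` identifies `o1` and `o2`, but the elements they point to stay distinct under the identity.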

Definition 14. Let $(m, n)$ be a simplifier for the prebraid $\mathcal{K} = \langle a, C, D, \pi_A, \pi_B \rangle$.
The image of $\mathcal{K}$ with respect to $(m, n)$ is the prebraid
$\langle a', C', D', \pi'_A, \pi'_B \rangle$ with the following components:^5

1. $a' := m(a)$,

2. $C' := \{m(c) \mid c \in C\}$ and
$D' := \{n(d) \mid d \in D\}$,

3. $\pi'_A := \{(m(o), n(d)) \mid (o, d) \in \pi_A, m(o) \in O_A(C')\}$, and
$\pi'_B := \{(n(o), m(c)) \mid (o, c) \in \pi_B, n(o) \in O_B(D')\}$.

Now assume that $\mathcal{K}$ is a braid. The braid-image of $\mathcal{K}$ with respect to $(m, n)$
is the unique subbraid of the image of $\mathcal{K}$ with root $a'$.^6

Example 3. Here is the braid-image of the braid from Example 2 under the
simplification $(m, n)$ where $m$ maps $o_3$ to $o_2$ and $n$ maps $u_3$ to $u_2$:


Lemma 15. Let $(m, n)$ be a simplifier for the prebraid $\mathcal{K}$. If the restrictions of $m$
and $n$ on $O_A(\mathcal{K})$ and $O_B(\mathcal{K})$ respectively are injective, then the image of $\mathcal{K}$ with
respect to $(m, n)$ is a variant of $\mathcal{K}$.

Call a simplifier $(m, n)$ for $\mathcal{K}$ strict if the restriction of $m$ on $O_A(\mathcal{K})$ or the
restriction of $n$ on $O_B(\mathcal{K})$ is not injective. A prebraid $\mathcal{K}$ is called irreducible if
$\mathcal{K}$ does not have a strict simplifier.

^5 Using Lemma 6 and the fact that both $A^\Sigma$ and $B^\Delta$ are non-collapsing it is trivial to
verify that the image of $\mathcal{K}$ is a prebraid.

^6 We would like to mention here one technical point behind the definition of a
simplifier. The set
$(\{m(o) \mid o \in O_A(C)\} \setminus O_A(C')) \cup (\{n(o) \mid o \in O_B(D)\} \setminus O_B(D'))$ is called the set
of pending atoms of the simplification step leading from $\mathcal{K}$ to its image. Pending atoms
may in fact occur (compare the inclusion mentioned in Lemma 6). This phenomenon
can be used to show that image and braid-image may really be different, and it is the
source of many technical problems for the mathematical treatment of simplification.
Lemma 16. (a) If the prebraid $\mathcal{K} = \langle a, C, D, \pi_A, \pi_B \rangle$ is irreducible, then $\pi_A$ and
$\pi_B$ are injective.

(b) If $\mathcal{K}'$ is a subbraid of the irreducible prebraid $\mathcal{K}$, then $\mathcal{K}'$ is irreducible.

(c) If $\mathcal{K}_1$ and $\mathcal{K}_2$ are subbraids of the irreducible prebraid $\mathcal{K}$, and if $\mathcal{K}_1$ and $\mathcal{K}_2$
are variants, then $\mathcal{K}_1 = \mathcal{K}_2$.

The following theorem represents the main result about simplification of braids.^7

Theorem 17. Let $\mathcal{K} = \mathcal{K}_0, \mathcal{K}_1, \ldots, \mathcal{K}_k$ be a sequence of braids such that each
braid $\mathcal{K}_{i+1}$ is the braid-image of $\mathcal{K}_i$ under a strict simplification, for $i =
0, \ldots, k - 1$. Then $k \le |O(\mathcal{K})|$. If $\mathcal{K}'$ is an irreducible braid that is reached
from $\mathcal{K}$ by a sequence of consecutive simplification steps (always taking
braid-images), then there exists a simplifier $(m, n)$ for $\mathcal{K}$ such that the braid-image
of $\mathcal{K}$ with respect to $(m, n)$ is $\mathcal{K}'$. If two irreducible braids $\mathcal{K}_1$ and $\mathcal{K}_2$ can be
reached from $\mathcal{K}$ by sequences of consecutive simplification steps (always taking
braid-images), then $\mathcal{K}_1$ and $\mathcal{K}_2$ are variants.

On the basis of Theorem 17 it is simple to see that we may introduce the following
equivalence relation on the set of all braids.

Definition 18. Two braids are called equivalent if they can be simplified to the
same irreducible braid image. If $\mathcal{K}$ is a braid, $[\mathcal{K}]$ denotes the set of all braids
that are equivalent to $\mathcal{K}$.

Standard  normalization 

In order to define the underlying domain of the rational amalgam we shall now
introduce a standard normal form for each braid. Let $O^*_A$ be a subset of the
set $O_A$ of open atoms of $A^\Sigma$ that has the same cardinality as the set of all
equivalence classes of non-trivial^8 braids of type B. Similarly, let $O^*_B$ be a subset
of the set $O_B$ of open atoms of $B^\Delta$ that has the same cardinality as the set of
all equivalence classes of non-trivial braids of type A. Let $A_* := SH^{\mathcal{M}}(Z \cup O^*_A)$
and let $B_* := SH^{\mathcal{N}}(Z \cup O^*_B)$. Lemma 10 of [BS95] shows

Lemma 19. Every bijection between $Z \cup O^*_A$ and $Z \cup O_A$ extends to a
$\Sigma$-isomorphism between $A_*^\Sigma$ and $A^\Sigma$. Similarly every bijection between $Z \cup O^*_B$
and $Z \cup O_B$ extends to a $\Delta$-isomorphism between $B_*^\Delta$ and $B^\Delta$.

We may now enumerate the elements of $O^*_A$ and of $O^*_B$ in the form

$O^*_A = \{o_{[\mathcal{K}]} \mid \mathcal{K} \text{ is a nontrivial braid of type B}\}$,

$O^*_B = \{o_{[\mathcal{K}]} \mid \mathcal{K} \text{ is a nontrivial braid of type A}\}$.

This means that $[\mathcal{K}] \mapsto o_{[\mathcal{K}]}$ establishes a bijection between the set of all
equivalence classes of non-trivial braids of type A (B) and $O^*_B$ ($O^*_A$). Let $\mathcal{K} =
\langle a, C, D, \pi_A, \pi_B \rangle$ be a prebraid. For each open atom $o \in O_A(C)$ ($o \in O_B(D)$) we
say that $o$ points in $\mathcal{K}$ to $\mathcal{K}'$ iff $\mathcal{K}'$ is the unique subbraid of $\mathcal{K}$ with root $\pi_A(o)$
($\pi_B(o)$).^9

^7 The proof given in [KS96] is very technical; a corresponding result for simplification
of prebraids is proved first.

^8 Compare Definition 9.

Definition 20. An irreducible prebraid $\mathcal{K}$ is in standard normal form if $O_A(\mathcal{K}) \cup
O_B(\mathcal{K}) \subseteq O^*_A \cup O^*_B$ and if every open atom $o \in O_A(\mathcal{K}) \cup O_B(\mathcal{K})$ points in $\mathcal{K}$ to
a subbraid $\mathcal{K}'$ such that $o = o_{[\mathcal{K}']}$.

With $A \odot B$ we denote the set of all braids over $A^\Sigma$ and $B^\Delta$ in standard normal
form. Note that trivial braids are always in standard normal form.

Lemma 21. Let $\mathcal{K}$ be a prebraid. Let $(m, n)$ denote the admissible pair of
endomorphisms that maps each $o \in O_A(\mathcal{K}) \cup O_B(\mathcal{K})$ to $o_{[\mathcal{K}']}$, where $\mathcal{K}'$ is the unique
braid such that $o$ points in $\mathcal{K}$ to $\mathcal{K}'$. Then $(m, n)$ is a simplifier for $\mathcal{K}$ and
the image of $\mathcal{K}$ with respect to $(m, n)$ is in standard normal form.

Definition 22. The process where we apply to a given (pre)braid $\mathcal{K}$ the simplifier
$(m, n)$ that maps each open atom $o \in O(\mathcal{K})$, pointing in $\mathcal{K}$ to the subbraid
$\mathcal{K}'$, to the open atom $o_{[\mathcal{K}']} \in O^*_A \cup O^*_B$ will be called standard simplification of
$\mathcal{K}$. The resulting image (braid-image) of $\mathcal{K}$ will be called the standard (braid)
normal form of $\mathcal{K}$.

Obviously  all  subbraids  of  a  prebraid  in  standard  normal  form  are  again  in 
standard  normal  form. 

Lemma 23. For each (pre)braid $\mathcal{K}$ there exists exactly one (pre)braid $\mathcal{K}'$ in
standard normal form such that $\mathcal{K}$ and $\mathcal{K}'$ are equivalent.

Lemma 24. Given $e \in (A_* \cup B_*) \setminus (O^*_A \cup O^*_B)$ there exists a unique braid $\mathcal{K} \in
A \odot B$ such that $e$ is the root of $\mathcal{K}$.

5  The  rational  amalgamated  product 

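Given the underlying domain of the rational amalgam of $A^\Sigma$ and $B^\Delta$, as
constructed above, there is now a perfectly natural way to introduce functions and
relations that interpret the symbols of the mixed signature $\Sigma \cup \Delta$. Consider the
two functions $root_A : A \odot B \to A_*$ and $root_B : A \odot B \to B_*$ defined by

$$root_A(\mathcal{K}) := \begin{cases} \text{the root of } \mathcal{K} & \text{if } \mathcal{K} \text{ is trivial or has type A,} \\ o_{[\mathcal{K}]} \in O^*_A & \text{if } \mathcal{K} \text{ is non-trivial and has type B,} \end{cases}$$

$$root_B(\mathcal{K}) := \begin{cases} \text{the root of } \mathcal{K} & \text{if } \mathcal{K} \text{ is trivial or has type B,} \\ o_{[\mathcal{K}]} \in O^*_B & \text{if } \mathcal{K} \text{ is non-trivial and has type A.} \end{cases}$$

As a direct consequence of Lemma 24 we obtain

Lemma 25. The functions $root_A$ and $root_B$ are bijections.

^9 Compare Lemma 11.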
Here is now the definition of the rational amalgamated product.

Definition 26. The rational amalgamated product $A^\Sigma \odot B^\Delta$ of $A^\Sigma$ and $B^\Delta$ is
the following $(\Sigma \cup \Delta)$-structure with carrier $A \odot B$:

1. Let $f \in \Sigma$ be an n-ary function symbol, let $\mathcal{K}_1, \ldots, \mathcal{K}_n \in A \odot B$. We define
$f_{A \odot B}(\mathcal{K}_1, \ldots, \mathcal{K}_n) := root_A^{-1}(f_{A_*}(root_A(\mathcal{K}_1), \ldots, root_A(\mathcal{K}_n)))$.

2. Let $p \in \Sigma$ be an n-ary predicate symbol, let $\mathcal{K}_1, \ldots, \mathcal{K}_n \in A \odot B$. We define
$A^\Sigma \odot B^\Delta \models p(\mathcal{K}_1, \ldots, \mathcal{K}_n)$ iff $A_*^\Sigma \models p(root_A(\mathcal{K}_1), \ldots, root_A(\mathcal{K}_n))$.

The interpretation of the function symbols $g \in \Delta$ and the predicate symbols
$q \in \Delta$ in $A^\Sigma \odot B^\Delta$ is defined symmetrically, using $root_B$.
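
Item 1 of Definition 26 is an instance of transporting an operation along a bijection. The toy sketch below illustrates only this mechanism; the three-element carrier, the dictionary `root_A` and the operation `succ_mod3` are invented for the example and have nothing to do with actual braids.

```python
def transport(f, root):
    """Define an operation on the carrier from f via the bijection root."""
    inverse = {value: key for key, value in root.items()}
    if len(inverse) != len(root):
        raise ValueError("root must be injective")

    def lifted(*elements):
        # Map the arguments down with root, apply f, map the result back.
        return inverse[f(*(root[k] for k in elements))]

    return lifted

# Hypothetical carrier {"K0", "K1", "K2"} in bijection with {0, 1, 2}.
root_A = {"K0": 0, "K1": 1, "K2": 2}
succ_mod3 = lambda x: (x + 1) % 3

succ_on_carrier = transport(succ_mod3, root_A)
assert succ_on_carrier("K2") == "K0"
```

By construction the bijection becomes an isomorphism for the transported operation, which is the content of the first half of Theorem 27 in this toy setting.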

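Theorem 27. As a $\Sigma$-structure, $A^\Sigma \odot B^\Delta$, $A_*^\Sigma$ and $A^\Sigma$ are isomorphic, and
$root_A : A^\Sigma \odot B^\Delta \to A_*^\Sigma$ is a $\Sigma$-isomorphism. As a $\Delta$-structure, $A^\Sigma \odot B^\Delta$, $B_*^\Delta$,
and $B^\Delta$ are isomorphic, and $root_B : A^\Sigma \odot B^\Delta \to B_*^\Delta$ is a $\Delta$-isomorphism.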

Let us add some evidence for the naturalness of rational amalgamation. First
we consider the case where the two components are strong non-collapsing
SC-structures over disjoint signatures. This is the situation where we can build both
the free amalgam and the rational amalgam with our actual methods.

Theorem 28. The free amalgamated product is, modulo isomorphism, a
substructure of the rational amalgamated product.

The result shows that there are interesting relationships between distinct
amalgamation constructions.

Theorem  29.  The  rational  amalgamated  product  of  two  algebras  of  rational 
trees  over  disjoint  signatures  is  isomorphic  to  the  algebra  of  rational  trees  over 
the  combined  signature. 

This  shows  that  our  general  construction,  complicated  as  it  might  appear,  yields 
the  expected  result  when  we  consider  more  concrete  situations. 

6  Combination  of  Constraint  Solvers 

Our last aim is to show how constraint solvers for two component structures
can be combined into a constraint solver for their rational amalgamated product.
Constraint solvers, as considered here, are essentially algorithms that decide
solvability of quantifier-free positive formulae in a given solution domain. We
(mostly) disregard disjunction since its integration is a triviality.

Definition 30. Let $\Sigma$ be a signature. A $\Sigma$-constraint is a conjunction of atomic
$\Sigma$-formulae.

In order to decide solvability of a “mixed” $(\Sigma \cup \Delta)$-constraint in a rational
amalgamated product $A^\Sigma \odot B^\Delta$ we shall decompose it into two pure constraints
over the signatures $\Sigma$ and $\Delta$ respectively. These output constraints are equipped
with additional restrictions of a particular type:
Definition 31. An A/N (atom/non-atom) declaration for a constraint $\gamma$ is a
pair $(U, W)$ such that $U \uplus W \subseteq Var(\gamma)$ is a disjoint union. Both $U$ and $W$ may
be empty. A solution $\nu_A$ of a constraint $\gamma$ in an SC-structure $(A^\Sigma, \mathcal{M}, X)$ is
called a solution of $(\gamma, U, W)$ if $\nu_A$ assigns distinct atoms to the variables in $U$,
and arbitrary non-atomic elements of $A$ to the variables in $W$.

In order to avoid some ballast in proofs we shall assume that at least one of
the two components is a non-trivial SC-structure, which means that it has at
least one non-atomic element. We may now formulate our main result concerning
combination of constraint solvers in the case of rational amalgamation.

Theorem 32. Let $A^\Sigma$ and $B^\Delta$ be two non-collapsing SC-structures over disjoint
signatures, let $A^\Sigma \odot B^\Delta$ denote their rational amalgam. Assume that at least one
of the two components is a non-trivial SC-structure. Then solvability of
$(\Sigma \cup \Delta)$-constraints in $A^\Sigma \odot B^\Delta$ is decidable if solvability of ($\Sigma$- resp. $\Delta$-) constraints
with A/N declarations is decidable for $A^\Sigma$ and $B^\Delta$.

There seems to be no general way to characterize solvability of $\Sigma$-constraints
with  A/N  declarations  in  purely  logical  terms.  But  for  a  restricted  class  of  compo¬ 
nent  structures — a  class  which  is  of  particular  interest  in  the  context  of  rational 
amalgamation — a  logical  characterization  of  the  problems  that  we  have  to  solve 
in  the  two  component  structures  can  be  given. 

Definition 33. A non-collapsing SC-structure $(A^\Sigma, \mathcal{M}, X)$ is called rational if
for every atom $x \in X$ and every element $a \in A$ there exists an endomorphism
$m \in \mathcal{M}$ that leaves all atoms $x' \neq x$ fixed such that $m(x) = m(a)$.^10

The  algebra  of  rational  trees  over  a  given  signature  is  always  a  rational  SC- 
structure.  The  same  holds  for  feature  structures  ([AP94]),  feature  structures 
with  arity,  and  domains  with  nested,  rational  lists.  For  rational  SC-structures 
we  obtain  the  following  extension  of  Theorem  32. 

Theorem 34. Let $A^\Sigma$ and $B^\Delta$ be two non-trivial rational SC-structures over
disjoint signatures, let $A^\Sigma \odot B^\Delta$ denote their rational amalgam. Then the positive
existential theory of $A^\Sigma \odot B^\Delta$ is decidable if the positive universal-existential
theory is decidable for both components $A^\Sigma$ and $B^\Delta$.

It is interesting to contrast this formulation with the corresponding combination
result for free amalgamation (Theorem 22 of [BS95]) which needs stronger
assumptions on the components: Let $A^\Sigma$ and $B^\Delta$ be two strong SC-structures over
disjoint signatures, let $A^\Sigma \otimes B^\Delta$ denote their free amalgam. Then the positive
existential theory of $A^\Sigma \otimes B^\Delta$ is decidable if the full positive theory is decidable
for both components $A^\Sigma$ and $B^\Delta$.

^10 The existence of such an endomorphism is trivial if $x \notin Stab^{\mathcal{M}}(a)$. In this case we
may always take the endomorphism $m$ of $A^\Sigma$ that maps $x$ to $a$ and leaves
all other atoms fixed. The situation of interest is the case where $x \in Stab^{\mathcal{M}}(a)$ and
$x \neq a$.
Corollary 35. Rational amalgamated products $A_1^{\Sigma_1} \odot \cdots \odot A_k^{\Sigma_k}$ have decidable
positive existential theory if the nontrivial components $A_i^{\Sigma_i}$ are rational tree
algebras, or nested, rational lists, or feature structures^11, or feature structures
with arity, for $i = 1, \ldots, k$, and if the signatures of the components are pairwise
disjoint.

In the rest of this section we describe the underlying algorithm that yields the
basis for Theorem 32 and is used to combine given constraint solvers for two
components into a constraint solver for the rational amalgam. The algorithm
reduces a mixed constraint $\gamma$ in the signature $(\Sigma \cup \Delta)$ non-deterministically to a
pair of constraints with A/N declarations over the “pure” signatures $\Sigma$ and $\Delta$
respectively. For simplicity we assume that the input formula $\gamma$ has the form
$\gamma = \gamma_0^\Sigma \wedge \gamma_0^\Delta$ where $\gamma_0^\Sigma$ is a conjunction of atomic $\Sigma$-formulae, and $\gamma_0^\Delta$ is a
conjunction of atomic $\Delta$-formulae. Moreover we assume that $\gamma$ does not contain any
equation between variables. These assumptions do not really restrict the
generality of the approach: simple techniques like “variable abstraction”, now standard
in this area, may be used to transform an arbitrary $(\Sigma \cup \Delta)$-constraint $\varphi$ into a
constraint $\gamma$ of the form given above, preserving solvability in both directions.

Algorithm 1. The input is a mixed constraint $\gamma = \gamma_0^\Sigma \wedge \gamma_0^\Delta$ of the form described
above. Let $V_0 = Var(\gamma_0^\Sigma) \cap Var(\gamma_0^\Delta)$ denote the set of shared variables of $\gamma$. The
algorithm has two steps, both of which are nondeterministic.

Step 1: Variable identification. Consider all possible partitions of the set of
all shared variables, $V_0$. Each of these partitions yields one of the new constraints
as follows. The variables in each class of the partition are “identified” with each
other by choosing an element of the class as representative, and replacing in the
input formula all occurrences of variables of the class by this representative.

Step 2: Choose signature labels. Let $\gamma_1^\Sigma \wedge \gamma_1^\Delta$ denote one of the formulae
obtained by Step 1, let $V_1$ denote the set of representatives of shared variables. The
set $V_1$ is partitioned into two subsets $U$ and $W$ in some arbitrary way.

Let $\alpha = \gamma_1^\Sigma$, let $\delta = \gamma_1^\Delta$. For each of the choices made in Steps 1 and 2, the
algorithm yields an output pair $((\alpha, U, W), (\delta, W, U))$, each component representing
a constraint with A/N declaration. □
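
The nondeterministic choices of Algorithm 1 can be enumerated explicitly. The following sketch covers only the combinatorial part (partitions of the shared variables and the split of the representatives into $U$ and $W$); how constraints are represented and how the substitution is applied to them is left open, and the function names are hypothetical.

```python
from itertools import product

def set_partitions(items):
    """Yield all partitions of a list as lists of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def output_choices(shared_variables):
    """Enumerate the choices of Step 1 and Step 2.

    Yields triples (substitution, U, W): substitution maps each shared
    variable to the representative of its class; U and W split the
    representatives. The first component would receive the declaration
    (U, W), the second component (W, U).
    """
    for partition in set_partitions(sorted(shared_variables)):
        substitution = {v: block[0] for block in partition for v in block}
        representatives = [block[0] for block in partition]
        for labels in product((True, False), repeat=len(representatives)):
            U = {r for r, in_U in zip(representatives, labels) if in_U}
            W = set(representatives) - U
            yield substitution, U, W

# Two shared variables: one partition with one class (2 labelings)
# plus one partition with two classes (4 labelings).
assert len(list(output_choices({"x", "y"}))) == 6
```

The number of output pairs grows quickly with the number of shared variables, since every partition with $k$ classes contributes $2^k$ labelings.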

Ignoring several details, Algorithm 1 can be obtained from the corresponding
algorithm for free amalgamation by omitting one non-deterministic step (namely
the choice of a linear ordering on the set of shared variables). This shows that
Algorithm 1 is more efficient than the combination scheme for free
amalgamation. The proof of Theorem 32 is based on the following proposition which is
verified in [KS96].

Proposition 36. The input formula $\gamma$ has a solution in $A^\Sigma \odot B^\Delta$ if and only
if there exists an output pair $((\alpha, U, W), (\delta, W, U))$ of Algorithm 1 such that
$(\alpha, U, W)$ has a solution in $A^\Sigma$ and $(\delta, W, U)$ has a solution in $B^\Delta$.


^11 As in Example 1, we refer to [AP94] for specificity.
7  Conclusion


In this paper we have introduced rational amalgamation, a general methodology
for combining constraint systems. The present work, in connection with the
discussion of free amalgamation in [BS95], seems to suggest a new view of the
problem of combining solution domains and constraint solvers. There is now
strong evidence that the situation considered in [BS95] and in this paper
(the construction of “mixed” elements of a combined domain, given the “pure”
elements of two component structures as construction units) is quite similar to
the process of building the elements of a single structure, given the symbols of
a fixed signature as construction units. We are confident that this analogy will
help to isolate the most important methods for combining structures, and to
understand the relationship and the differences between different amalgamation
constructions.

When we compose elements, given the symbols of a fixed signature, three
different structures may be obtained in a direct way, depending on the
composition principle, namely the free term algebra, the algebra of rational trees, and
the algebra of infinite trees. The privileged role of these three algebras, and the
rich amount of interesting relationships between them, are now well-understood
(e.g., [Co83, Ma88]). We believe that free amalgamation, rational amalgamation
and a further construction called “infinite amalgamation” (still to be
investigated) reflect this role on the higher level of amalgamation constructions. Many
of the results that we have obtained for free and rational amalgamation can be
interpreted in this sense:

- The universality property of the free amalgamated product (see [BS95])
reflects the status of the free term algebra as the absolutely free $\Sigma$-algebra.

- We have seen that the free amalgamated product is always a substructure of
the rational amalgamated product. This reflects the fact that the free term
algebra is always a substructure of the algebra of rational trees.

- It is well-known that the unification algorithm for the algebra of rational
trees can be considered as the variant of the unification algorithm for the
free term algebra where we omit the occur-check. Similarly, the
decomposition scheme for rational amalgamation as given here (i.e., Algorithm 1) is
essentially the decomposition scheme for free amalgamation where we omit
the “interstructural” occur-check that is provided by the choice of a linear
ordering in the latter scheme.

We would not be surprised if many more principles, techniques and theorems,
well-known on the level of tree constructions, could be lifted to the level of
combining structures. Our experience with rational amalgamation seems to indicate
that this is a difficult, but promising line of research if we want to understand
the scale of possibilities, and the limitations for combining solution domains and
constraint solvers.
Abstract. We study the problems of deciding consistency and performing
variable elimination for disjunctions of linear inequalities and
inequations with at most one inequality per disjunction. This new class of
constraints  extends  the  class  of  generalized  linear  constraints  originally 
studied  by  Lassez  and  McAloon.  We  show  that  deciding  consistency  of  a 
set  of  constraints  in  this  class  can  be  done  in  polynomial  time.  We  also 
present  a  variable  elimination  algorithm  which  is  similar  to  Fourier’s 
algorithm  for  linear  inequalities. 


1  Introduction 

Linear constraints over the reals have recently been studied in depth by
researchers in constraint logic programming (CLP) and constraint databases (CDB)
[JM94, KKR95, Kou94]. The most important operations in CLP and CDB
systems are deciding consistency of a set of constraints and performing variable
elimination.

Disjunctions of linear constraints over the reals are important in many
applications [JM94]. The problem of deciding consistency for an arbitrary set of
disjunctions of linear constraints is NP-complete [Son85]. It is therefore
interesting to discover classes of disjunctions of linear constraints for which
consistency can be decided in PTIME. In [LM89a], Lassez and McAloon studied
the class of generalized linear constraints, which includes linear inequalities
(e.g., $2x_1 + 3x_2 - 5x_3 \le 6$) and disjunctions of linear inequations^1
(e.g., $2x_1 + 3x_2 - 4x_3 \neq 4 \lor x_2 + x_3 + x_5 \neq 1$). Among other things, they have shown
that the consistency problem for this class can be solved in PTIME.

Several researchers ([Kou92, IvH93, Imb93, Imb94]) have studied the problem of
variable elimination for generalized linear constraints. The basic algorithm for
variable elimination was discovered independently in [Kou92] and [Imb93], but [Kou92]
used the result only in the context of temporal constraints. The basic
algorithm is essentially an extension of Fourier’s elimination algorithm [Sch86] to
deal with disjunctions of inequations. If $S$ is a set of constraints, let $|S|$ denote
its cardinality. Let $C = I \cup D_n$ be a set of generalized linear constraints, where $I$
is a set of inequalities and $D_n$ is a set of disjunctions of inequations. If we
eliminate $m$ variables from $C$ using the basic algorithm proposed by Koubarakis and
Imbert then the resulting set contains $O(|I|^{2^m})$ inequalities and $O(|D_n|\,|I|^{2^{m+1}})$
disjunctions of inequations. Many of these constraints are redundant. Imbert has
studied this problem in more detail and presented more advanced algorithms
that eliminate redundant constraints [Imb93, Imb94].

^1 Some people prefer the term disequations [Imb94].

The  contributions  of  this  paper  are  the  following: 

-  We  extend  the  class  of  generalized  linear  constraints  to  include  disjunctions 
with  an  unlimited  number  of  inequations  and  at  most  one  inequality  per 
disjunction.  For  example: 

$3x_1 + x_5 - 4x_3 \leq 7 \lor 2x_1 + 3x_2 - 4x_3 \neq 4 \lor x_2 + x_3 + x_5 \neq 7$

The  resulting  class  will  be  called  the  class  of  Horn  constraints  since  there 
seems  to  be  some  analogy  with  Horn  clauses.  We  show  that  deciding  consis¬ 
tency  can  still  be  done  in  PTIME  for  this  class  (Theorem  9). 

-  We  extend  the  basic  variable  elimination  algorithm  of  [Kou92,  Imb93]  for 
the  new  class  of  Horn  constraints. 

The  paper  is  organized  as  follows.  Section  2  introduces  the  basic  concepts 
needed  for  the  developments  of  this  paper.  Section  3  presents  the  algorithm  for 
deciding  consistency.  Section  4  presents  the  algorithm  for  variable  elimination. 
Section  5  discusses  future  work. 


2  Preliminaries 

We consider the n-dimensional Euclidean space $\mathcal{R}^n$. We assume that the readers
are familiar with linear constraints over $\mathcal{R}^n$. We will consider linear inequalities
(e.g., $2x_1 + 3x_2 - 5x_3 \leq 6$), equations (e.g., $2x_1 + 3x_2 - 6x_3 = 6$) and inequations
(e.g., $2x_1 + 3x_2 - 5x_3 \neq 6$). If $S$ is a set of constraints then the solution set of $S$
will be denoted by $Sol(S)$. We will use the same notation for the solution set of
a single constraint.

Let us now present some concepts of convex geometry [Sch86, Gru67]. We
will take the definitions of these concepts from [LM89a]. If $V$ is a subspace of
the n-dimensional Euclidean space and $p$ a vector in $\mathcal{R}^n$ then the translation
$p + V$ is called an affine space. The intersection of all affine spaces that contain
a set $S$ is an affine space, called the affine closure of $S$ and denoted by $Aff(S)$.
If $e$ is a linear equation then the solution set of $e$ is called a hyperplane. In $\mathcal{R}^3$
the hyperplanes are the planes. In $\mathcal{R}^2$ the hyperplanes are the straight lines.
A hyperplane is an affine space and every affine space is the intersection of a
finite number of hyperplanes. If $E$ is a set of equalities then $Sol(E)$ is an affine
space. If $i$ is a linear inequality then the solution set of $i$ is called a half-space.
If $I$ is a set of inequalities then $Sol(I)$ is the intersection of a finite number of
half-spaces, and is called a polyhedral set.

A set $S \subseteq \mathcal{R}^n$ is called convex if the line segment joining any pair of points
in $S$ is included in $S$. Affine subspaces of $\mathcal{R}^n$ are convex. Half-spaces are convex.
Also, polyhedral sets are convex.

Let  us  now  define  the  class  of  constraints  that  we  will  consider. 

Definition 1. A Horn constraint is a disjunction $d_1 \lor d_2 \lor \cdots \lor d_n$ where
each $d_i$, $i = 1, \ldots, n$ is a weak linear inequality or a linear inequation, and the
number of inequalities among $d_1, \ldots, d_n$ does not exceed one. If there are no
inequalities then a Horn constraint will be called negative. Otherwise it will be
called positive.

Example  1.  The  following  are  examples  of  Horn  constraints: 

$3x_1 + x_5 - 4x_3 \leq 7 \lor 2x_1 + 3x_2 - 4x_3 \neq 4 \lor x_2 + x_3 + x_5 \neq 7$

Axi  +  oTs  3  V  53^2  —  A  ^  6 

The  first  constraint  is  positive  while  the  second  is  negative. 

According  to  the  above  definition  weak  inequalities  can  also  be  considered 
as  positive  Horn  constraints.  However,  with  the  exception  of  Section  4,  we  will 
usually  find  it  more  convenient  to  consider  inequalities  separately. 

Negative Horn constraints have been considered before in [LM89a, LM89b,
Kou92, IvH93, Imb93, Imb94, Kou95]. To the best of our knowledge positive
Horn constraints have not been considered before. If $d$ is a positive Horn
constraint then $d \equiv \neg(E \land i)$ where $E$ is a conjunction of equations and $i$ is a
strict inequality. We will often use this notation for positive Horn constraints.

We do not need to introduce strict inequalities in the above definition. A
strict inequality like $x_1 + x_2 + x_3 < 5$ can be equivalently written as follows:

$x_1 + x_2 + x_3 \leq 5, \quad x_1 + x_2 + x_3 \neq 5$

Similarly, a constraint $x_1 + x_2 + x_3 < 5 \lor \phi$ where $\phi$ is a disjunction of inequations
is equivalent to the conjunction of the following constraints:

$x_1 + x_2 + x_3 \leq 5 \lor \phi, \quad x_1 + x_2 + x_3 \neq 5 \lor \phi$

A similar observation is made in [BB95] in the context of the ORD-Horn class
of temporal constraints.
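
The rewriting of a strict inequality into a positive and a negative Horn
constraint can be illustrated with a short Python sketch. The encoding of a
disjunct as an `(op, coeffs, bound)` tuple and the helper names `split_strict`
and `holds` are assumptions made for this illustration only; they are not part
of the paper.

```python
from fractions import Fraction


def split_strict(coeffs, bound, phi=()):
    """Rewrite  a.x < b  v  phi  as a pair of Horn constraints.

    A constraint is a tuple of disjuncts; each disjunct is
    (op, coeffs, bound) with op in {"<=", "!="}.
    phi is a (possibly empty) tuple of "!=" disjuncts.
    """
    coeffs = tuple(Fraction(c) for c in coeffs)
    bound = Fraction(bound)
    weak = (("<=", coeffs, bound),) + tuple(phi)  # a.x <= b  v  phi  (positive)
    neq = (("!=", coeffs, bound),) + tuple(phi)   # a.x != b  v  phi  (negative)
    return [weak, neq]


def holds(constraint, point):
    """Evaluate a disjunction of linear disjuncts at a point."""
    def disjunct_holds(op, coeffs, bound):
        value = sum(c * Fraction(v) for c, v in zip(coeffs, point))
        return value <= bound if op == "<=" else value != bound

    return any(disjunct_holds(*d) for d in constraint)
```

A point satisfies the original strict constraint exactly when it satisfies both
returned constraints, which is the equivalence stated above.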

If $d$ is a negative Horn constraint then the solution set of $d$ is
$Sol(d) = \mathcal{R}^n \setminus Sol(\neg d)$.
The constraint $\neg d$ is a conjunction of equations thus $Sol(\neg d)$ is an affine
space. If $\neg d$ is inconsistent then $d$ is equivalent to true (e.g., $x \neq 2 \lor x \neq 3$). In
the rest of the paper we will ignore negative Horn constraints that are equivalent
to true.

If $d$ is a positive Horn constraint of the form $\neg(E \land i)$ then
$Sol(d) = \mathcal{R}^n \setminus Sol(\neg d)$.
The constraint $\neg d$ is a conjunction $E \land i$ where $E$ is a conjunction
of equations and $i$ is a strict inequality. If $E \land i$ is inconsistent then its
corresponding Horn constraint $d$ is equivalent to true (e.g., $d \equiv x \neq 2 \lor x \leq 3$).
If $E \land i$ is consistent and $Sol(i) \supseteq Sol(E)$ then $d \equiv \neg E$, so $d$ is actually a
negative Horn constraint (e.g., $x \neq 2 \lor y \neq 2 \lor x \geq c$ with $c > 2$ is equivalent
to $x \neq 2 \lor y \neq 2$).

If $E \land i$ is consistent and $Sol(i) \not\supseteq Sol(E)$ then its solution set will be called
a half affine space. In $\mathcal{R}^3$ the half affine spaces are half-lines or half-planes. For
example, $z = 2 \land x > 0$ is a half-plane. In the rest of the paper we will ignore
positive Horn constraints equivalent to a negative Horn constraint or true.

3  Deciding  Consistency 

[LM89a]  showed  that  negative  Horn  constraints  can  be  treated  independently 
of  one  another  for  the  purposes  of  deciding  consistency.  The  following  is  one  of 
their  basic  results. 

Theorem 2. Let $C = I \cup D_n$ be a set of constraints where $I$ is a set of linear
inequalities and $D_n$ is a set of negative Horn constraints. Then $C$ is consistent
if and only if $I$ is consistent, and for each $d \in D_n$ the set $I \cup \{d\}$ is consistent.

Whether a set of inequalities is consistent or not can be decided in PTIME
using Khachiyan's linear programming algorithm [Sch86]. We can also detect in
PTIME whether $I \cup \{d\}$ is consistent by simply running Khachiyan's algorithm
$2n$ times to decide whether $I$ implies every equality $e$ in the conjunction of $n$
equalities $\neg d$. In other words, deciding consistency in the presence of negative
Horn constraints can be done in PTIME.^

Is  it  possible  to  extend  this  result  to  the  case  of  positive  Horn  constraints?  In 
what  follows,  we  will  answer  this  question  affirmatively.  Let  us  start  by  pointing 
out  that  the  independence  property  of  negative  Horn  constraints  does  not  carry 
over  to  positive  ones. 

Example 2. Let $I = \{x \geq 1, x \leq 5, y = 3\}$. The constraint sets

$I \cup \{\neg(y = 3 \land x > 1)\}$ and $I \cup \{\neg(y = 3 \land x = 1)\}$

are consistent. But the set $I \cup \{\neg(y = 3 \land x > 1), \neg(y = 3 \land x = 1)\}$ is
inconsistent.

Fortunately, there is still enough structure available in our problem which
we can exploit to come up with a PTIME consistency checking algorithm. Let
$C = I \cup D_p \cup D_n$ be a set of constraints where $I$ is a set of inequalities, $D_p$ is
a set of positive Horn constraints, and $D_n$ is a set of negative Horn constraints.
Intuitively, the solution set of $C$ is empty only if the polyhedral set defined
by $I$ is covered by the affine spaces and half affine spaces defined by the Horn
constraints.

The algorithm Consistency shown in Figure 1 proceeds as follows. Initially,
we check whether $I$ is consistent. If this is the case, then we proceed to examine
whether $Sol(I)$ can be covered by $Sol(\{\neg d : d \in D_p \cup D_n\})$. To verify this, we

^ The exact algorithm that Lassez and McAloon give in [LM89a] is different but this
is not significant for the purposes of this paper.

Algorithm Consistency

Input: A set of constraints $C = I \cup D_p \cup D_n$.
Output: “consistent” if $C$ is consistent. Otherwise “inconsistent”.
Method:
  If $I$ is inconsistent then return “inconsistent”
  Repeat
    Done $\leftarrow$ true
    For each $d \in D_p \cup D_n$ do
      If $d \equiv \neg(E \land i) \in D_p$ and $Sol(I) \subseteq Sol(E)$ then
        $I \leftarrow I \land \neg i$
        If $I$ is inconsistent then return “inconsistent”
        Done $\leftarrow$ false
        Remove $d$ from $D_p$
      Else If $d \in D_n$ and $Sol(I) \subseteq Sol(\neg d)$ then
        Return “inconsistent”
      End If
    End For
  Until Done
  Return “consistent”

Fig. 1. Deciding consistency of a set of Horn constraints


make successive passes over $D_p \cup D_n$. In each pass, we carry out two checks. The
first check discovers whether there is any positive Horn constraint $d \equiv \neg(E \land i)$
such that $Sol(I)$ is included in the affine space defined by $E$. If this is the case
then $d$ is discarded and $I$ is updated to reflect the part possibly “cut off” by $d$.
The resulting solution set $Sol(I)$ is still a polyhedral set. An inconsistency can
arise if $Sol(I)$ is reduced to $\emptyset$ by successive “cuts”. In each pass we also check
whether there is an affine space (represented by the negation of a negative Horn
constraint) which covers $Sol(I)$. In this case there is an inconsistency as well.
The algorithm stops when there are no more affine spaces or half affine spaces
that pass the two checks. In this case $C$ is consistent.
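
The procedure described above can be sketched in Python as follows. This is an
illustration under stated assumptions, not the PTIME procedure of the paper:
feasibility of a set of inequalities is tested here with Fourier-Motzkin
elimination over exact rationals (exponential in the worst case) instead of a
polynomial linear programming algorithm. The tuple encodings and the names
`feasible`, `implies_eq`, and `consistent` are hypothetical.

```python
from fractions import Fraction


def feasible(cons, n):
    """Fourier-Motzkin test: is there x in R^n with a.x <= b
    (a.x < b when strict) for every (a, b, strict) in cons?"""
    cons = [(tuple(Fraction(c) for c in a), Fraction(b), s) for a, b, s in cons]
    for j in range(n):
        upper = [c for c in cons if c[0][j] > 0]
        lower = [c for c in cons if c[0][j] < 0]
        rest = [c for c in cons if c[0][j] == 0]
        for au, bu, su in upper:
            for al, bl, sl in lower:
                ku, kl = 1 / au[j], 1 / -al[j]  # positive scaling factors
                a = tuple(ku * p + kl * q for p, q in zip(au, al))
                rest.append((a, ku * bu + kl * bl, su or sl))
        cons = rest
    # Only constraints of the form 0 <= b or 0 < b remain.
    return all((0 < b) if s else (0 <= b) for _, b, s in cons)


def implies_eq(ineqs, eq, n):
    """ineqs implies a.x = b iff neither a.x < b nor a.x > b is
    satisfiable together with ineqs (two feasibility tests)."""
    a, b = eq
    neg = tuple(-c for c in a)
    return (not feasible(ineqs + [(a, b, True)], n)
            and not feasible(ineqs + [(neg, -b, True)], n))


def consistent(ineqs, horn, n):
    """horn is a list of pairs (E, i) encoding not(E and i).
    E is a list of equations (a, b); i is None for a negative Horn
    constraint, otherwise a strict inequality (a, b) meaning a.x < b."""
    ineqs = list(ineqs)
    if not feasible(ineqs, n):
        return False
    pending = list(horn)
    done = False
    while not done:
        done = True
        for d in list(pending):
            E, i = d
            if all(implies_eq(ineqs, e, n) for e in E):
                if i is None:
                    return False  # Sol(I) is covered by an affine space
                a, b = i
                # Add the negation of a.x < b, that is, -a.x <= -b.
                ineqs.append((tuple(-c for c in a), -b, False))
                if not feasible(ineqs, n):
                    return False
                pending.remove(d)
                done = False
    return True
```

As a usage example, take variables $(x, y)$ and $I = \{x \geq 1, x \leq 5, y = 3\}$,
with $y = 3$ encoded as two weak inequalities. Passing either
$\neg(y = 3 \land x > 1)$ or $\neg(y = 3 \land x = 1)$ alone yields `True`, while
passing both yields `False`, which shows that positive Horn constraints cannot
be checked independently of the others.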

Let us now prove the correctness of algorithm Consistency. First, we will
need a few technical lemmas. The first two lemmas show that the sets resulting
from successive “cuts” inflicted on $Sol(I)$ by positive Horn constraints passing
the first check of the algorithm are indeed polyhedral. The lemmas also give a
way to compute the inequalities defining these sets.

Lemma 3. Let $I$ be a set of inequalities and $\neg(E \land i)$ be a Horn constraint
such that $Sol(I) \subseteq Sol(E)$. Then $Sol(I \land \neg(E \land i)) = Sol(I \land \neg i)$.

Proof. Let $x \in Sol(I \land \neg(E \land i))$. Then $x \in Sol(I)$ and $x \in Sol(\neg(E \land i))$. If
$x \in Sol(\neg(E \land i))$ then $x \in Sol(\neg E)$ or $x \in Sol(\neg i)$. But $Sol(I) \cap Sol(\neg E) = \emptyset$
because $Sol(I) \subseteq Sol(E)$. Therefore, $x \in Sol(I)$ and $x \in Sol(\neg i)$. Equivalently,
$x \in Sol(I \land \neg i)$. The other direction of the proof is trivial.

Lemma 4. Let $I$ be a set of inequalities and $d_k \equiv \neg(E_k \land i_k)$, $k = 1, \ldots, m$ be
a set of Horn constraints such that $Sol(I) \subseteq Sol(E_1)$ and

$$Sol(I \land \bigwedge_{k=1}^{l} \neg i_k) \subseteq Sol(E_{l+1}) \quad \text{for } l = 1, \ldots, m-1.$$

Then

$$Sol(I \land \bigwedge_{k=1}^{m} d_k) = Sol(I \land \bigwedge_{k=1}^{m} \neg i_k).$$

Proof. The proof is by induction on $m$. The base case $m = 1$ follows from Lemma
3. For the inductive step, let us assume that the lemma holds for $m - 1$ Horn
constraints. Then

$$Sol(I \land \bigwedge_{k=1}^{m} d_k) = Sol(I \land \bigwedge_{k=1}^{m-1} d_k) \cap Sol(d_m) = Sol(I \land \bigwedge_{k=1}^{m-1} \neg i_k) \cap Sol(d_m) = Sol((I \land \bigwedge_{k=1}^{m-1} \neg i_k) \land d_m)$$

using the inductive hypothesis.

The assumptions of this lemma and Lemma 3 imply that

$$Sol((I \land \bigwedge_{k=1}^{m-1} \neg i_k) \land d_m) = Sol(I \land \bigwedge_{k=1}^{m} \neg i_k).$$

Thus

$$Sol(I \land \bigwedge_{k=1}^{m} d_k) = Sol(I \land \bigwedge_{k=1}^{m} \neg i_k).$$

The following lemmas show that if there are Horn constraints that do not
pass the two checks of algorithm Consistency then the affine spaces or half
affine spaces corresponding to their negations cannot cover the polyhedral set
defined by the inequalities.

Lemma 5. Let $S$ be a convex set of dimension $d$ and suppose that $S_1, \ldots, S_n$ are
convex sets of dimension $d_i < d$, $i = 1, \ldots, n$. Then $S \not\subseteq \bigcup_{i=1}^{n} S_i$.

Proof. See Lemma 2 of [LM89a].

Lemma 6. Let $I$ be a consistent set of inequalities and $d_k \equiv \neg(E_k \land i_k)$,
$k = 1, \ldots, m$ be a set of Horn constraints such that $Sol(I) \not\subseteq Sol(E_k)$ and
$Sol(I) \cap Sol(E_k) \neq \emptyset$ for all $k = 1, \ldots, m$. Then $Sol(I) \not\subseteq \bigcup_{k=1}^{m} Sol(\neg d_k)$.

Proof. The proof is very similar to the proof of Theorem 1 of [LM89a].

Since $Sol(I) \not\subseteq Sol(E_k)$ then $Aff(Sol(I)) \not\subseteq Sol(E_k)$. This means that
$Sol(E_k) \cap Aff(Sol(I))$ is an affine space of strictly lower dimension than $Aff(Sol(I))$.
Then $Sol(E_k) \cap Sol(I)$ is of strictly lower dimension than $Sol(I)$ since the
dimension of $Sol(I)$ is equal to that of $Aff(Sol(I))$. Thus from Lemma 5,
$Sol(I) \not\subseteq \bigcup_{k=1}^{m} Sol(E_k)$. Notice now that $Sol(\neg d_k) \subseteq Sol(E_k)$ for all
$k = 1, \ldots, m$. Therefore $\bigcup_{k=1}^{m} Sol(\neg d_k) \subseteq \bigcup_{k=1}^{m} Sol(E_k)$. We can now conclude that
$Sol(I) \not\subseteq \bigcup_{k=1}^{m} Sol(\neg d_k)$.

The  following  theorems  demonstrate  that  the  algorithm  Consistency  is 
correct  and  can  be  implemented  in  PTIME. 

Theorem  7.  If  algorithm  CONSISTENCY  returns  “inconsistent”  then  its  input 
C  is  inconsistent. 

Proof. If the algorithm returns “inconsistent” in its first line then $I$, and
therefore $C$, is inconsistent.

If the algorithm returns “inconsistent” in the third if-statement then there
are positive Horn constraints $d_k \equiv \neg(E_k \land i_k)$, $k = 1, \ldots, m \leq |D_p|$ such that
the assumptions of Lemma 4 hold for $I$ and $d_1, \ldots, d_m$. Therefore

$$Sol(I \land \bigwedge_{k=1}^{m} d_k) = Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) = \emptyset.$$

Consequently, $Sol(C) = \emptyset$ because $Sol(C) \subseteq Sol(I \land \bigwedge_{k=1}^{m} d_k)$.

If the algorithm returns “inconsistent” in the fourth if-statement then there
are positive Horn constraints $d_1, \ldots, d_m \in D_p$ and a negative constraint $d_{m+1} \in D_n$
such that the assumptions of Lemma 4 hold for $I$ and $d_1, \ldots, d_m$, and

$$Sol(I \land \bigwedge_{k=1}^{m} d_k) \subseteq Sol(\neg d_{m+1}).$$

But then

$$Sol(C) \subseteq Sol(I \land \bigwedge_{k=1}^{m+1} d_k) = Sol(I \land \bigwedge_{k=1}^{m} d_k) \cap Sol(d_{m+1}) = \emptyset.$$

Theorem 8. If algorithm Consistency returns “consistent” then its input
$C$ is consistent.

Proof. If the algorithm returns “consistent” then $I$ is consistent. Let $d_1, \ldots, d_m$
be the positive Horn constraints removed from $D_p \cup D_n$ by the algorithm, and
$d_{m+1}, \ldots, d_n$ be the remaining Horn constraints. Then

$$Sol(C) = Sol(I \land \bigwedge_{k=1}^{n} d_k) = Sol(I \land \bigwedge_{k=1}^{m} d_k) \setminus \bigcup_{k=m+1}^{n} Sol(\neg d_k) = Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) \setminus \bigcup_{k=m+1}^{n} Sol(\neg d_k).$$

Notice that $Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) \neq \emptyset$ otherwise the algorithm outputs
“inconsistent” in Step 2. Also, $Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) \not\subseteq Sol(E_k)$ for all $k = m+1, \ldots, n$
otherwise the algorithm would have removed $d_k$ from $D_p \cup D_n$.

Without any loss of generality we can also assume that

$$Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) \cap Sol(E_k) \neq \emptyset$$

for all $k = m+1, \ldots, n$ (if this does not hold for constraint $d_k$, this constraint can
be discarded without changing $Sol(C)$). From Lemma 6 we can now conclude
that $Sol(I \land \bigwedge_{k=1}^{m} \neg i_k) \not\subseteq \bigcup_{k=m+1}^{n} Sol(\neg d_k)$. Therefore $Sol(C) \neq \emptyset$.

Theorem 9. The algorithm Consistency can be implemented in PTIME.

Proof. It is not difficult to see that the algorithm can be implemented in PTIME.
The repeat loop is executed at most $|D_p| + 1$ times, since every pass except the
last removes at least one constraint from $D_p$, and each pass performs at most
$|D_p| + |D_n|$ tests.
The consistency of $I$ can be checked in PTIME using Khachiyan's algorithm for
linear programming [Sch86]. The test $Sol(I) \subseteq Sol(E)$ can be verified by checking
whether every equation $e$ in the conjunction $E$ is implied by $I$. This can be
done in PTIME using Khachiyan's algorithm $2n$ times where $n$ is the number of
equations in $E$. In a similar way one can implement the test $Sol(I) \subseteq Sol(\neg d)$
in PTIME when $d$ is a negative Horn constraint.

4  Variable  Elimination 

In  this  section  we  study  the  problem  of  variable  elimination  for  sets  of  Horn  con¬ 
straints.  The  algorithm  VarElimination,  shown  in  Figure  2,  eliminates  a  given 
variable  x  from  a  set  of  Horn  constraints  C.  More  variables  can  be  eliminated  by 
successive  applications  of  VarElimination.  For  the  purposes  of  this  algorithm 
we  consider  inequalities  as  positive  Horn  constraints. 

The  algorithm  VarElimination  is  similar  to  the  one  studied  in  [Kou92, 
Imb93]  for  the  case  of  negative  Horn  constraints. 

Theorem  10.  The  algorithm  VarElimination  is  correct. 

Proof. Let the variables of $C$ be $X = \{x, x_2, \ldots, x_n\}$. If $(x^0, x_2^0, \ldots, x_n^0) \in \mathcal{R}^n$
is an element of $Sol(C)$ then it can be easily seen that $(x_2^0, \ldots, x_n^0)$ is also an element of
$Sol(C')$.

Conversely, take $(x_2^0, \ldots, x_n^0) \in \mathcal{R}^{n-1} \cap Sol(C')$ and consider the set $C(x, x_2^0, \ldots, x_n^0)$.
If this set is simplified by removing constraints equivalent to true, disjunctions
equivalent to false, and redundant constraints then

$$C(x, x_2^0, \ldots, x_n^0) = \{l^0 \leq x \leq u^0\} \cup \{x \neq a_i^0, \; i = 1, \ldots, \lambda\}.$$

Let us now assume (by contradiction) that there is no value $x^0 \in \mathcal{R}$ such that
$(x^0, x_2^0, \ldots, x_n^0) \in Sol(C)$. This can happen only under the following cases:

Algorithm VarElimination

Input: A set of Horn constraints $C$ in variables $X$, and a variable
  $x \in X$ to be eliminated from $C$.
Output: A set of Horn constraints $C'$ in variables $X \setminus \{x\}$ such that
  $Sol(C') = Projection_{X \setminus \{x\}}(Sol(C))$.
Method:
  Rewrite each constraint containing $x$ as $x \leq U \lor \phi$ or $L \leq x \lor \phi$ or $x \neq A \lor \phi$
  where $\phi$ is a disjunction of inequations.
  For each pair of positive Horn constraints $x \leq U \lor \phi_1$ and $L \leq x \lor \phi_2$ do
    Add $L \leq U \lor \phi_1 \lor \phi_2$ to $C'$
  End For
  For each pair of positive Horn constraints $x \leq U \lor \phi_1$ and $L \leq x \lor \phi_2$ do
    For each negative Horn constraint $x \neq A \lor \phi$ do
      Add $A \neq L \lor A \neq U \lor \phi$ to $C'$
    End For
  End For
  Add each constraint not containing $x$ to $C'$
  Return $C'$

Fig. 2. A variable elimination algorithm


1. $u^0 < l^0$. If inequalities $x \leq u^0$ and $l^0 \leq x$ come from positive Horn constraints
$x \leq u \lor \phi_1$ and $l \leq x \lor \phi_2$ then

$$\phi_1(x_2^0, \ldots, x_n^0) \equiv \phi_2(x_2^0, \ldots, x_n^0) \equiv false$$

otherwise these constraints would have been discarded from $C(x, x_2^0, \ldots, x_n^0)$
during its simplification. But because $l \leq u \lor \phi_1 \lor \phi_2 \in C'$ and
$(x_2^0, \ldots, x_n^0) \in Sol(C')$ then $l^0 \leq u^0$. Contradiction!

2. $l^0 = u^0 = a_i^0$ for some $i$, $1 \leq i \leq \lambda$. With reasoning similar to the above, we can show
that this case is also impossible.

Finally, we can conclude that there exists a value $x^0 \in \mathcal{R}$ such that
$(x^0, x_2^0, \ldots, x_n^0) \in Sol(C)$.

Let $C = I \cup D_p \cup D_n$ be a set of constraints. Eliminating $m$ variables from
$C$ with repeated applications of the above algorithm will result in a set with
$O((|I| + |D_p|)^{2^m})$ positive Horn constraints and $O(|D_n| (|I| + |D_p|)^{2^{m+1}})$ negative
Horn constraints. Many of these constraints will be redundant; it is therefore
important to extend this work with efficient redundancy elimination algorithms
that can be used together with VarElimination.

5 Conclusions and Future Work

In  future  work  we  would  like  to  follow  the  steps  of  [Kou92,  Kou95]  and  consider 
the  applicability  of  the  results  of  this  paper  to  temporal  reasoning.  It  is  clear 
that  the  class  of  constraints  covered  here  is  more  expressive  than  the  quantitative 
temporal  constraints  of  [Kou92,  Kou95]  and  the  ORD-Horn  class  of  [BB95].  An 
open  question  is  whether  one  can  use  the  results  of  this  paper  to  find  more 
efficient algorithms for the ORD-Horn class of qualitative temporal constraints.
In  [Kou95]  we  carried  out  successfully  a  similar  investigation  for  the  class  of  PA 
networks  [VKvB89]  using  the  results  of  [Kou92]. 

Another interesting problem would be to study more advanced variable
elimination algorithms for Horn constraints. The results of [Imb93, Imb94], which
apply to negative Horn constraints only, should be a good starting point in this
direction.

Abstract. Following the work of Wallace, who introduced the use of directed
arc-consistency in MAX-CSP algorithms using DAC counts, we present a number of
improvements of DAC usage for the P-EFC3 algorithm. These improvements include: (i) a
better detection of dead-ends, (ii) a more effective form for value pruning, and (iii) a
different heuristic criterion for value ordering. Considering the new DAC usage, we have
analyzed some static variable ordering heuristics previously suggested, and we propose
new ones which have been shown to be effective. The benefits of our proposal have been
assessed empirically solving random CSP instances, showing a clear performance gain with
respect to previous approaches.


1  Introduction 

Constraint satisfaction problems (CSP) consider the assignment of values to variables
under a set of constraints. A solution is a total assignment satisfying every
constraint. If no such assignment exists, the problem is overconstrained, and it
may be of interest to find a total assignment satisfying as many constraints as
possible. This problem is called the maximal constraint satisfaction problem
(MAX-CSP), and a solution is a total assignment satisfying the maximum number of
constraints. MAX-CSP is of interest in several areas of application [Fox, 87;
Feldman and Golumbic, 90; Bakker et al, 93].

P-EFC3  is  one  of  the  best  algorithms  for  MAX-CSP  [Freuder  and  Wallace,  92].  It 
is  a  branch  and  bound  algorithm  including  forward  checking  in  order  to  anticipate 
dead-ends  before  they  actually  occur.  To  detect  dead-ends,  P-EFC3  computes  a  lower 
bound  of  the  number  of  unsatisfied  constraints  from  (i)  the  set  of  past  (assigned) 
variables,  and  (ii)  the  effect  of  past  variables  on  future  (unassigned)  ones,  by  using 
inconsistency  counts  (IC).  Recently,  this  lower  bound  has  been  improved  by 
including  unsatisfied  constraints  from  the  set  of  future  variables.  This  has  been 
implemented  using  directed  arc-consistency  counts  (DAC)  [Wallace,  94],  and  the 
reported  empirical  results  show  a  clear  improvement  in  performance  with  respect  to 
pure  P-EFC3.  DAC  are  computed  in  a  preprocessing  step  following  a  variable 
ordering,  which  is  later  followed  by  P-EFC3  for  variable  instantiation. 

In this paper, we present further improvements on the DAC usage inside the
P-EFC3 algorithm. Our proposal considers three points: (i) a better dead-end detection,
by increasing the lower bound associated with the current node, (ii) a more effective
criterion for value pruning, and (iii) a different heuristic criterion for value ordering.
All these modifications are simple consequences of the following fact: for each
feasible value of a future variable, its associated IC and DAC can be added, producing a
lower bound of the number of inconsistencies involving that value if it is eventually
assigned to the variable. This produces a better lower bound for branch and bound,
which causes a significant improvement in the algorithm efficiency compared with
the previous version presented in [Wallace, 94]. In addition, given that P-EFC3 with
DAC follows a static variable ordering (SVO), we have analyzed some SVO
heuristics previously reported, as well as new ones. We have assessed their relative
efficiency solving random MAX-CSP instances.

It is worth noting that DAC use is not only a technical improvement on the
P-EFC3 algorithm. In addition, the use of DAC allowed us to discover an
easy-hard-easy pattern in the search effort of solving MAX-CSP instances with constraint
tightness as varying parameter [Larrosa and Meseguer, 96], analogous to the pattern
observed when solving CSP instances [Prosser, 94; Smith, 94]. This phenomenon
does not appear when solving MAX-CSP using branch and bound based algorithms
without considering some degree of consistency among future variables.

This  paper  is  organized  as  follows.  In  Section  2  we  present  related  algorithms  for 
MAX-CSP.  In  Section  3  we  provide  some  preliminaries  and  definitions  required  in 
the  rest  of  the  paper.  In  Section  4,  we  analyze  the  use  of  DAC  in  [Wallace,  94],  and 
present  our  improvements.  In  Section  5,  we  discuss  several  SVO  heuristics.  Both 
DAC  improvements  and  heuristics  are  evaluated  in  Section  6,  where  we  provide 
empirical  results  solving  random  CSP  instances.  Finally,  Section  7  contains  the 
conclusions  of  this  work. 


2  Related  Work 

The simplest algorithm for MAX-CSP follows a branch and bound scheme. This
algorithm performs a systematic traversal of the search tree generated by the problem,
where each node has an associated set of assigned variables (called past variables) and a set
of uninstantiated (or future) variables. At each node one future variable is selected
(current  variable)  and  all  its  feasible  values  are  considered  for  instantiation.  Associated 
with  each  node  there  is  a  cost  function,  defined  as  the  number  of  constraints  violated 
by  the  assigned  variables.  This  cost  function  is  called  the  distance  from  a  total 
assignment  satisfying  every  constraint.  Branch  and  bound  keeps  track  of  the  best 
solution  obtained  so  far,  which  is  the  total  assignment  with  minimum  distance  in  the 
explored  part  of  the  search  tree.  When  a  partial  assignment  has  a  distance  greater  than 
or  equal  to  the  distance  of  the  current  best  solution,  this  line  of  search  is  abandoned 
because  it  cannot  lead  to  a  better  solution  than  the  current  one  and  it  is  said  that  a  dead 
end  has  been  found.  The  distance  of  the  current  best  solution  is  used  as  an  upper 
bound  of  the  allowable  cost,  while  the  distance  of  the  current  partial  assignment  is  a 
lower  bound  of  the  cost  for  any  assignment  including  this  partial  assignment. 
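
The basic scheme described above can be sketched in Python as follows. This is
a minimal illustration, not the P-EFC3 algorithm: it uses a fixed variable
order, no forward checking, and no value ordering. The function name
`max_csp_branch_and_bound` and the encoding of constraints as sets of permitted
value pairs are assumptions made for this example.

```python
def max_csp_branch_and_bound(domains, constraints):
    """domains: list of value lists, one per variable (variables are 0..n-1).
    constraints: dict mapping (i, j) with i < j to the set of permitted
    (value_i, value_j) pairs.
    Returns (best_distance, best_assignment)."""
    n = len(domains)
    best = [float("inf"), None]  # upper bound and the solution achieving it

    def violations(assignment, i, value):
        # Constraints violated between X_i = value and the past variables.
        count = 0
        for j, past_value in enumerate(assignment):
            allowed = constraints.get((j, i))
            if allowed is not None and (past_value, value) not in allowed:
                count += 1
        return count

    def search(assignment, distance):
        if distance >= best[0]:
            return  # dead end: cannot improve on the best solution so far
        i = len(assignment)
        if i == n:
            best[0], best[1] = distance, list(assignment)
            return
        for value in domains[i]:
            cost = violations(assignment, i, value)
            assignment.append(value)
            search(assignment, distance + cost)
            assignment.pop()

    search([], 0)
    return best[0], best[1]
```

For instance, three variables with domain `[0, 1]` and a "different values"
constraint on every pair cannot all be satisfied, and the search returns a
minimum distance of 1.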

The basic branch and bound can be enhanced with more sophisticated strategies
based on previous work on solvable CSP. Prospective algorithms look ahead to
compute some form of local consistency between past and future variables. The most
common prospective algorithm is forward checking. It evaluates the impact of the
current partial assignment on future variables, which allows the algorithm to improve
the lower bound. Retrospective algorithms remember previous actions in order to
avoid repeating them in the future and, in this way, they can save redundant constraint
checks. Examples of retrospective algorithms are backjumping and backmarking. All these
algorithms are complete. For a detailed description, see [Freuder and Wallace, 92].

Several  heuristics  for  variable  and  value  selection  have  been  developed.  Regarding 
SVO  heuristics,  variables  can  be  ordered  by  decreasing  width.  This  heuristic  is 
enhanced  when  combined  with  a  second  heuristic  to  break  ties,  such  as  minimum 
domain  size,  maximum  degree,  or  largest  mean  ACC  (arc  consistency  counts)  in  its 
domain.  These  combinations  are  called  conjunctive  width  heuristics  [Wallace  and 
Freuder,  93].  For  static  value  ordering,  values  are  ordered  by  increasing  ACC. 
Regarding  dynamic  variable  ordering  (DVO)  in  forward  checking,  variables  are  ordered 
either  by  the  largest  mean  of  inconsistency  counts  in  their  domains  or  by  minimum 
domain  size,  and  dynamic  value  ordering  considers  values  by  increasing  IC  [Freuder 
and Wallace, 92]. Two heuristics for dynamic variable and value ordering are given in
[Larrosa and Meseguer, 95], based on gradients of a local consistency function.

Finally,  the  notion  of  directed  arc-consistency  was  first  introduced  by  [Dechter  and 
Pearl,  88]  in  the  context  of  CSP.  The  use  of  DAC  in  MAX-CSP  as  a  method  to 
improve  the  computation  of  the  lower  bound  for  the  current  partial  assignment  is 
proposed  in  [Wallace,  94].  In  a  pre-processing  step,  DAC  are  computed  for  each  value 
following  a  fixed  variable  order.  This  order  must  remain  as  SVO  for  variable 
instantiation  in  the  forward  checking  algorithm.  DAC  counts  are  added  to  other  counts 
(distance  and  IC)  to  compute  a  better  lower  bound  for  the  current  partial  assignment. 

3  Preliminaries 

A discrete binary CSP is defined by a finite set of variables $\{X_i\}$ taking values on
discrete and finite domains $\{D_i\}$ under a set of binary constraints $\{R_{ij}\}$. A constraint
$R_{ij}$ is a subset of $D_i \times D_j$, containing the permitted values for $X_i$ and $X_j$. The number
of variables is $n$ and, without loss of generality, we will assume a common domain $D$
for all the variables, $m$ being its cardinality. A global solution of the CSP is an
assignment of values to variables satisfying every constraint. If no solution exists the
CSP is overconstrained; in this case we are interested in finding solutions satisfying a
maximum number of constraints. This problem is usually referred to as MAX-CSP.

P-EFC3  is  an  efficient  algorithm  to  solve  MAX-CSP  [Freuder  and  Wallace,  92] 
and  its  pseudo-code  appears  in  Figure  1.  P-EFC3  follows  the  branch  and  bound 
schema  enhanced  with  forward  checking.  It  keeps  for  all  feasible  values  of  future 
variables  the  number  of  inconsistencies  with  previous  assignments.  The  IC  associated 
with  value  I  of  variable  X^ ,  icn,  is  the  number  of  inconsistencies  that  value  I  of  X, 
has  with  the  assignments  of  past  variables.  Denoting  by  F  the  set  of  indexes  of  future 
variables,  the  sum  Yminf^{iCjk}  is  a  lower  bound  of  the  number  of  inconsistencies 
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that  will  occur  in  future  variables  if  the  current  partial  assignment  is  extended  into  a 
total  one.  Therefore,  this  term  can  be  included  to  compute  the  lower  bound  of  the 
current  partial  assignment  (lines  12  and  14).  In  addition,  a  value  /  of  a  future  variable 
Xi  can  be  pruned  if  the  distance  of  the  current  partial  assignment  plus  icn  is  not 
lower  than  the  best  distance  (lines  23  and  26).  Given  a  variable  ordering,  the  DAC 
associated  to  a  value  /  of  a  variable  Xi ,  dacn,  is  the  number  of  variables  which  are 
arc-inconsistent  with  value  I  for  Xi  and  appear  after  Xi  in  the  ordering  [Wallace,  94]. 
dacii  is  a  lower  bound  of  the  number  of  inconsistencies  that  Xi  will  have  with 
variables  after  Xi  in  the  ordering  if  I  is  assigned  to  Xi . 
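
As an illustration of the DAC definition, the following sketch computes the counts for a given static ordering. It assumes that constraints are stored as sets of permitted value pairs keyed by variable pairs; the function name and data layout are hypothetical and chosen only for this example.

```python
def compute_dac(order, domains, constraints):
    """dac[i][l]: number of variables after X_i in `order` that have no value
    compatible with the assignment X_i = l (arc-inconsistency with value l)."""
    position = {var: pos for pos, var in enumerate(order)}
    dac = {i: {l: 0 for l in domains[i]} for i in order}
    for (a, b), allowed in constraints.items():
        # Orient the constraint so that i precedes j in the ordering.
        if position[a] < position[b]:
            i, j, pairs = a, b, allowed
        else:
            i, j, pairs = b, a, {(y, x) for (x, y) in allowed}
        for l in domains[i]:
            if not any((l, k) in pairs for k in domains[j]):
                dac[i][l] += 1
    return dac


domains = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
constraints = {
    (0, 1): {(0, 0), (0, 1)},   # value 1 of X_0 has no support in X_1
    (0, 2): {(0, 1), (1, 1)},   # value 0 of X_2 has no support in X_0
    (1, 2): {(0, 0)},           # value 1 of X_1 and value 1 of X_2 lack support
}
dac_forward = compute_dac([0, 1, 2], domains, constraints)
dac_reverse = compute_dac([2, 1, 0], domains, constraints)
```

Comparing `dac_forward` with `dac_reverse` shows that the counts depend on the direction of the ordering, which is why the same ordering must later be used for instantiation.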


procedure P-EFC3(current_solution, distance,
                 remaining_variables, remaining_domains)
 1  Xi := select-next-variable(remaining_variables);
 2  values := sort-values(Xi, remaining_domains);
 3  while values ≠ ∅ do
 4    l := first(values);
 5    new_distance := distance + ic_il;
 6    if (remaining_variables - Xi = ∅) then
 7      if (new_distance < best_distance) then
 8        best_distance := new_distance;
 9        best_solution := current_solution + (Xi,l);
10      endif
11    else
12      if (new_distance + Σ_{j∈F} min_k{ic_jk} < best_distance) then
13        new_remaining_domains := look_ahead(domains, Xi, l);
14        if (not empty_domain and
              (new_distance + Σ_{j∈F} min_k{ic_jk} < best_distance)) then
15          P-EFC3(current_solution + (Xi,l), new_distance,
                   remaining_variables - Xi, new_remaining_domains);
16        endif
17      endif
18    endif
19    values := values - l;
20  endwhile
endprocedure

function look_ahead(domains, Xi, l)
21  forall j ∈ F do
22    forall k ∈ Feasible_j do
23      if (new_distance + ic_jk ≥ best_distance) then prune(Xj,k);
24      elsif (inconsistent(Xi, l, Xj, k)) then
25        ic_jk := ic_jk + 1;
26        if (new_distance + ic_jk ≥ best_distance) then prune(Xj,k);
27        endif
28      endif
29    endforall
30  endforall
31  return updated_domains;
endfunction

Fig.  1.  The  extended  forward  checking  algorithm. 
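
The scheme of Fig. 1 can be sketched in runnable form as follows. This is a simplified illustration under several assumptions, not the authors' C implementation: variables are instantiated in lexicographic order, values are sorted by increasing IC, constraints are given through a hypothetical `conflicts(i, a, j, b)` predicate, and pruned values are simply dropped from the IC tables instead of being marked in a separate domain structure. It assumes at least one variable.

```python
import math


def p_efc3(n, domain, conflicts):
    """Branch and bound with forward checking for MAX-CSP (sketch of Fig. 1).

    conflicts(i, a, j, b) is True when X_i = a and X_j = b violate a constraint.
    Returns (best_distance, best_solution).
    """
    best = {"distance": math.inf, "solution": None}

    def look_ahead(i, l, rest, ic, new_distance):
        """Propagate X_i = l into the IC of future values; prune hopeless ones."""
        new_ic = {}
        for j in rest:
            new_ic[j] = {}
            for k, count in ic[j].items():
                count += conflicts(i, l, j, k)            # line 25
                if new_distance + count < best["distance"]:
                    new_ic[j][k] = count                  # value stays feasible
            if not new_ic[j]:
                return None                               # empty domain
        return new_ic

    def search(solution, distance, future, ic):
        i, rest = future[0], future[1:]
        for l in sorted(ic[i], key=ic[i].get):            # increasing IC
            new_distance = distance + ic[i][l]            # line 5
            if not rest:                                  # lines 6-10
                if new_distance < best["distance"]:
                    best["distance"] = new_distance
                    best["solution"] = solution + [l]
                continue
            bound = new_distance + sum(min(ic[j].values()) for j in rest)
            if bound >= best["distance"]:                 # line 12
                continue
            new_ic = look_ahead(i, l, rest, ic, new_distance)
            if new_ic is None:
                continue
            bound = new_distance + sum(min(new_ic[j].values()) for j in rest)
            if bound < best["distance"]:                  # line 14
                search(solution + [l], new_distance, rest, new_ic)

    initial_ic = {j: {k: 0 for k in domain} for j in range(n)}
    search([], 0, list(range(n)), initial_ic)
    return best["distance"], best["solution"]
```

For example, `p_efc3(3, [0, 1], lambda i, a, j, b: a == b)` searches the overconstrained problem in which three binary variables must be pairwise different.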

4  Using  DAC  with  Forward  Checking

The use of DAC with forward checking was introduced in [Wallace, 94], where it
was proposed to take advantage of DAC inside the P-EFC3 algorithm in
three ways:


1. Anticipating the detection of dead-ends. A dead end occurs when it is known that a
partial assignment cannot be extended into a total assignment better than the best
solution found so far. This is done by comparing the lower bound of the partial
assignment with the distance of the best solution found. Wallace proposed the
inclusion of inconsistencies among future variables in the lower bound
computation [Wallace, 94], using the following expression,

    $\sum_{j \in F} \min_k \{dac_{jk}\}$

to estimate the minimum number of inconsistencies that will occur among future
variables. This term can be added to the expression (lines 12 and 14),

    $new\_distance + \sum_{j \in F} \min_k \{ic_{jk}\}$

increasing the lower bound of the current node. The three summands can be added
because they refer to inconsistencies produced by different constraints:
new_distance refers to violated constraints among past variables, the sum of
minimum IC refers to constraints between past and future variables, and the sum
of minimum DAC refers to constraints among future variables. Therefore no
constraint is considered more than once. Note that in line 12, it is also possible to
add the term $dac_{il}$ to the lower bound, but not in line 14. This is because $dac_{il}$
refers to constraints between $X_i$ and future variables, and line 12 (before
look_ahead) does not consider that information in its sum of minimum IC.
However, in line 14 (after look_ahead) this information has become apparent in
the IC of arc-inconsistent variables. Therefore, if $dac_{il}$ had been included in line
14, it would have been counted twice. Lines 12 and 14 are replaced by,

12  if (new_distance + dac_il + Σ_{j∈F} min_k{ic_jk} + Σ_{j∈F} min_k{dac_jk} < best_distance)

14  if (not empty_domain and
        new_distance + Σ_{j∈F} min_k{ic_jk} + Σ_{j∈F} min_k{dac_jk} < best_distance)


Note  that  the  new  lower  bound  is  greater  than  or  equal  to  the  previous,  so  this 
version  of  the  algorithm  cannot  visit  more  nodes  than  pure  P-EFC3. 

2. Increasing the number of unfeasible values. A feasible value $k$ of a future variable
$X_j$ is pruned when it is known that it cannot be in the best solution (lines 23 and
26). The sum of its IC plus DAC counts, $ic_{jk} + dac_{jk}$, is a lower bound on the number of
inconsistencies that will occur if value $k$ is eventually assigned to $X_j$.
Therefore, lines 23 and 26 are replaced by,

23  if (new_distance + dac_il + ic_jk + dac_jk ≥ best_distance) then prune(Xj,k)
26  if (new_distance + ic_jk + dac_jk ≥ best_distance) then prune(Xj,k)

Again, $dac_{il}$ cannot be used in line 26 because inconsistencies produced by the
constraint between $X_i$ and $X_j$ have already been propagated as inconsistency counts
by look_ahead.

3.  Value  ordering.  The  order  in  which  feasible  values  of  a  variable  are  considered 
depends  on  a  heuristic  criterion.  The  information  contained  in  DAC  can  be  used  to 
order  values.  In  [Wallace,  94]  values  are  ordered  by  increasing  DAC  (or  ACC, 
when  they  are  computed). 
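
The lower bound and pruning test of this first use of DAC can be sketched as below. The nested dictionaries `ic[j][k]` and `dac[j][k]`, indexed by future variable and feasible value, and the function names are assumptions made for illustration; the pruning comparison follows the "not lower than" wording used in Section 3.

```python
def lower_bound_dac1(new_distance, ic, dac, future):
    """new_distance + sum of minimum IC + sum of minimum DAC over future variables.

    Only the feasible values (the keys of ic[j]) are considered for each j.
    """
    min_ic = sum(min(ic[j][k] for k in ic[j]) for j in future)
    min_dac = sum(min(dac[j][k] for k in ic[j]) for j in future)
    return new_distance + min_ic + min_dac


def prunable_dac1(new_distance, ic_jk, dac_jk, best_distance):
    """Pruning test of the modified line 26: value k of X_j cannot improve."""
    return new_distance + ic_jk + dac_jk >= best_distance


# Two future variables with two feasible values each (hypothetical counts).
ic = {1: {0: 2, 1: 0}, 2: {0: 1, 1: 1}}
dac = {1: {0: 0, 1: 1}, 2: {0: 0, 1: 0}}
bound = lower_bound_dac1(3, ic, dac, [1, 2])
```

In this example the minima of IC and DAC for variable 1 are attained at different values, so the bound adds 0 + 0 for that variable even though no single value achieves both.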

This  version  of  P-EFC3  using  DAC  was  proposed  in  [Wallace,  94],  and  we  will 
refer  to  it  as  P-EFC3+DAC1.  In  the  following  we  present  some  improvements  in  the 
use  of  DAC,  which  consider  again  the  three  points  addressed  by  Wallace  (we  will  refer 
to  this  version  as  P-EFC3+DAC2): 


1. Anticipating the detection of dead-ends. Since $ic_{il}$ and $dac_{il}$ refer to different sets of
constraints, we can take their sum as the minimum number of inconsistencies
that the assignment of value $l$ to variable $X_i$ will produce if that assignment is
eventually made. Therefore we can replace the two sums of minimum IC and DAC,

    $\sum_{j \in F} \min_k \{ic_{jk}\} + \sum_{j \in F} \min_k \{dac_{jk}\}$

by the sum of minimum (IC+DAC),

    $\sum_{j \in F} \min_k \{ic_{jk} + dac_{jk}\}$

as a lower bound of the effect of assigning future variables. Note that,

    $\sum_{j \in F} \min_k \{ic_{jk}\} + \sum_{j \in F} \min_k \{dac_{jk}\} \le \sum_{j \in F} \min_k \{ic_{jk} + dac_{jk}\}$

so the use of this expression will be at least as effective as the previous one. In
the P-EFC3 algorithm, this means replacing lines 12 and 14 by,

12  if (new_distance + dac_il + Σ_{j∈F} min_k{ic_jk + dac_jk} < best_distance)

14  if (not empty_domain and
        new_distance + Σ_{j∈F} min_k{ic_jk + dac_jk} < best_distance)

2. Increasing the number of unfeasible values. So far, a value $k$ of a future variable $X_j$
is pruned when the sum of the distance of the current partial assignment plus the
number of inconsistencies caused by the assignment of $k$ to $X_j$ is no lower than
the best distance. In addition, we can include the minimum number of
inconsistencies which will occur among the rest of future variables, no matter
which value will be finally assigned to $X_j$. This minimum number is as follows,

    $\sum_{p \in F - \{j\}} \min_q \{ic_{pq} + dac_{pq}\}$

Therefore, lines 23 and 26 can be replaced by,

23  if (new_distance + dac_il + ic_jk + dac_jk + Σ_{p∈F-{j}} min_q{ic_pq + dac_pq} ≥
        best_distance) then prune(Xj,k)

26  if (new_distance + ic_jk + dac_jk + Σ_{p∈F-{j}} min_q{ic_pq + dac_pq} ≥
        best_distance) then prune(Xj,k)

Inside the look_ahead procedure some IC may be incremented, causing some
increase in $\sum_{j \in F} \min_k \{ic_{jk} + dac_{jk}\}$. Nevertheless, in our implementation this
expression is not updated during the look_ahead call; it is computed only once at
the beginning of look_ahead, and its value is kept even though it may become
outdated. The algorithm is still correct because IC propagation can
only increase this expression, and we use it as a lower bound. It is an open
question whether continuously updating this sum of minima is cost effective
with respect to our implementation. Note that the difference consists in whether
this expression is a local variable or a function call inside look_ahead.

3. Value ordering. We consider that the information contained in DAC is
complementary to that contained in IC, and that the two can be added to estimate the
goodness of a value. In our implementation we order values by increasing
IC+DAC as a combination of the heuristics presented in [Freuder and Wallace, 92],
where values are ordered by increasing IC, and [Wallace, 94], where values are
ordered by increasing DAC or ACC (if it is computed). Note that value ordering is
done under a heuristic criterion and its benefit cannot be guaranteed in general.
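
The difference between the two lower bounds, and the strengthened pruning test, can be illustrated with a short sketch. As before, the dictionaries `ic[j][k]` and `dac[j][k]` and all function names are assumptions of this example; the sum over the other future variables is recomputed on each call here for clarity, whereas the text describes computing it once per look_ahead call.

```python
def bound_dac1(new_distance, ic, dac, future):
    """Separate sums of minimum IC and minimum DAC (first use of DAC)."""
    return (new_distance
            + sum(min(ic[j][k] for k in ic[j]) for j in future)
            + sum(min(dac[j][k] for k in ic[j]) for j in future))


def bound_dac2(new_distance, ic, dac, future):
    """Single sum of minimum (IC + DAC); never smaller than bound_dac1."""
    return new_distance + sum(
        min(ic[j][k] + dac[j][k] for k in ic[j]) for j in future)


def prunable_dac2(new_distance, ic, dac, future, j, k, best_distance):
    """Modified line 26: add the minimum contribution of the other future variables."""
    others = sum(min(ic[p][q] + dac[p][q] for q in ic[p])
                 for p in future if p != j)
    return new_distance + ic[j][k] + dac[j][k] + others >= best_distance


# Hypothetical counts for two future variables with two feasible values each.
ic = {1: {0: 2, 1: 0}, 2: {0: 1, 1: 1}}
dac = {1: {0: 0, 1: 1}, 2: {0: 0, 1: 0}}
old_bound = bound_dac1(3, ic, dac, [1, 2])
new_bound = bound_dac2(3, ic, dac, [1, 2])
```

With these counts the combined bound is strictly larger because, for variable 1, no single value attains both the minimum IC and the minimum DAC.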

The  three  aspects  where  DAC  can  improve  the  performance  of  P-EFC3  have  been 
introduced  separately,  but  they  are  closely  related.  If  pruning  is  more  effective,  the 
sum  of  minimum  icjk+dacjk  is  higher  and  dead-ends  are  detected  earlier.  If  values  are 
considered in a better order, better upper bounds are established sooner, pruning is
more effective and dead-ends are found at higher levels of the tree. Therefore, it is
expected that combining these features will magnify their effect on the performance
of the algorithm.


5  Static  Variable  Ordering  Heuristics 

It is well known that the use of heuristics for variable and value ordering is of great
importance in MAX-CSP algorithms. A right choice of heuristics can produce large
savings in the search effort. The use of DAC requires a static ordering of variables
because DAC can only be used if variables are considered for instantiation in the same
order in which DAC were computed. For this reason, we analyze different SVO heuristics
to be used with P-EFC3+DAC2. First, let us consider the following observation,

Observation 1: Given a SVO $\{X_1, X_2, \ldots, X_n\}$ used by P-EFC3 (either using DAC
or not) with values ordered by increasing IC (or IC+DAC), P-EFC3 visits at most
$O(m^k)$ nodes, where $k$ is the index of the first variable in the ordering from which no
constraint exists between variables which are posterior to $X_k$ in the ordering.

Proof (sketched): At level $k$ there is no constraint among future variables, so $dac_{jl} = 0$
$\forall j > k$ and $ic_{jl}$ is exactly the number of inconsistencies that value $l$ will have if it is
assigned to $X_j$. $\sum_{j \in F} \min_q \{ic_{jq} + dac_{jq}\}$ is exactly the number of inconsistencies that
future variables will produce if they are instantiated with the values that minimize
this expression, which is the best that can be done extending the current partial
assignment. When visiting a node at level $k$, every constraint has become explicit and
every potential inconsistency has been propagated into IC of future variables.
Therefore, the algorithm will not proceed to deeper levels unless the best current
solution can be improved. In this case, due to the value ordering criteria used, level $n$
will be reached from level $k$ without backtracking and a new upper bound will be
established. The algorithm will return to level $k$ without attempting new assignments
because the upper bound becomes smaller than or equal to the lower bound. Therefore,
the only exponential growth of the search space is in the first levels of the tree. □
This observation shows how to determine, for P-EFC3 (with or without DAC),
backtrack-free areas of the search space. It can be extended to all P-EFC algorithms
and the proof remains the same. In the following, we discuss two objectives that a
SVO should pursue to take the maximum advantage of P-EFC3+DAC2.
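
The index $k$ of Observation 1 can be computed directly from the constraint graph and the ordering. The sketch below is one reading of the observation: a constraint lies entirely among the variables after position $k$ only if both of its variables come later than $k$, so $k$ must reach the earlier endpoint of every constraint. The function name and the edge-list representation are assumptions of this example.

```python
def backtrack_free_level(order, constrained_pairs):
    """Smallest k (1-based) such that no constraint links two variables placed
    after position k in `order`; 0 if there are no constraints at all."""
    position = {var: pos for pos, var in enumerate(order, start=1)}
    return max(
        (min(position[a], position[b]) for a, b in constrained_pairs),
        default=0,
    )


# Star-shaped constraint graph: variable 0 is constrained with all the others.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
k_centre_first = backtrack_free_level([0, 1, 2, 3, 4], star)
k_centre_last = backtrack_free_level([1, 2, 3, 4, 0], star)
```

Placing the centre of the star first gives a much smaller $k$ than placing it last, so the $O(m^k)$ bound on visited nodes depends strongly on the ordering.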

1. Based on Observation 1, a first objective is selecting a SVO which minimizes
$k$, in order to stop the possible exponential growth of the search space at higher
levels of the tree.

2.  The  efficiency  of  branch  and  bound  largely  depends  on  the  quality  of  its  lower 
bound.  At  high  levels  of  the  tree  it  is  desirable  to  have  high  lower  bounds  to 
perform  an  early  detection  of  dead  ends.  In  our  algorithm  the  lower  bound  is 
computed  using  distance,  IC  and  DAC.  Trying  to  increase  the  lower  bound  at  early 
levels  of  the  tree,  we  can  order  variables  according  to  three  different  criteria: 

2.1.  Selecting  first  variables  with  a  high  expected  contribution  to  distance. 

2.2.  Selecting  first  variables  with  a  high  expected  contribution  to  IC  of  futures. 

2.3.  Selecting  variables  in  such  an  order  that  DAC  contribution  is  maximized. 

With these objectives in mind we analyze several SVO heuristics, some of them
based on the graph topology. The degree of a variable $X_i$ is the number of variables
constrained with it. Given an ordering, we define the forward degree of $X_i$ as the
number of variables constrained with $X_i$ and appearing after it in the ordering.
Conversely, the backward degree of $X_i$ is the number of variables constrained with $X_i$
and appearing before it in the ordering (this concept is also called the width of a
variable). Obviously, the sum of forward and backward degrees of a variable is the
variable degree. Variables can be ordered by:

1. Decreasing backward degree (BD). This heuristic was already proposed in [Wallace
and Freuder, 93], and it considers first variables most constrained with past
variables. When one of these variables is selected, it is expected that its values will
have high IC (from the high level of connection with past variables), so after
variable assignment the distance of the current partial assignment is likely to
increase (objective 2.1). The main disadvantage of this heuristic is the lack of
information in the first levels of the tree.

procedure forward_degree_SVO
  V := {X1, ..., Xn};
  while V ≠ ∅ do
    select as the next variable the one sharing most
    constraints with the rest of variables in V;
    remove it from V
  endwhile
endprocedure

Fig. 2. Pseudo-code for computing FD SVO heuristic.

2. Decreasing forward degree (FD). It considers first variables most constrained with
future variables. This heuristic approximates objective 1 because, if a variable $X_i$
is constrained with $p$ future variables and we want to have unconstrained future
variables, we must remove either $X_i$ or the other $p$ variables. We will probably
achieve the goal sooner if we remove $X_i$. This heuristic also approximates
objective 2.2, especially in the first levels of the search tree, where IC are low and
it may be better to select variables highly constrained with future variables to
maximize the propagation of IC towards future variables. Figure 2 shows the
pseudocode for computing FD.

3. Decreasing degree (DG). This heuristic was already proposed in [Wallace and
Freuder, 93], and it can be seen as a combination of the two previous ones, BD and FD.
At the first levels of the tree, DG selects variables highly constrained with future
variables (FD dominates), but at deep levels DG selects variables highly connected
with past variables (BD dominates). Regarding the above mentioned objectives,
DG gradually combines the approximation to objectives 1 and 2.2 (FD
component) with objective 2.1 (BD component).

4.  Decreasing  ACC  mean  (AC).  ACC  were  introduced  in  [Freuder  and  Wallace,  92]. 
Here,  we  consider  it  as  a  way  to  maximize  the  DAC  contribution  to  the  lower 
bound  (objective  2.3).  Variables  with  high  ACC  will  probably  have  also  high 
DAC  if  they  are  considered  first  in  the  ordering,  because  only  those  arc- 
inconsistencies  referring  to  prior  variables  will  not  be  in  their  DAC. 

All these heuristics can be combined, prioritizing one and using another (or even a
third) to break ties [Wallace and Freuder, 93]. In the following section, we provide
empirical results of several combinations of heuristics for random CSP classes.
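
Two of the topological heuristics can be sketched as follows. `forward_degree_svo` follows the greedy procedure of Fig. 2, and `degree_svo` sorts by decreasing degree; in both, ties are broken by the input order of the variables, which stands in for the lexicographic tie-breaking used in the experiments. The edge-list representation and function names are assumptions of this illustration.

```python
def forward_degree_svo(variables, constrained_pairs):
    """FD ordering (Fig. 2): repeatedly pick the variable sharing most
    constraints with the variables not yet ordered."""
    neighbours = {v: set() for v in variables}
    for a, b in constrained_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    remaining = list(variables)
    order = []
    while remaining:
        pool = set(remaining)
        # max() returns the first maximal element, so ties keep input order.
        chosen = max(remaining, key=lambda v: len(neighbours[v] & pool))
        order.append(chosen)
        remaining.remove(chosen)
    return order


def degree_svo(variables, constrained_pairs):
    """DG ordering: decreasing degree; sorted() is stable, so ties keep input order."""
    degree = {v: 0 for v in variables}
    for a, b in constrained_pairs:
        degree[a] += 1
        degree[b] += 1
    return sorted(variables, key=lambda v: -degree[v])


# Path-shaped constraint graph 0 - 1 - 2 - 3 - 4.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
fd_order = forward_degree_svo(range(5), path)
dg_order = degree_svo(range(5), path)
```

On this path, FD leaves no constraint among the variables that follow its first two choices, whereas DG needs three, which illustrates how FD approximates objective 1.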


6  Experimental  Results 

We have evaluated empirically the proposed algorithms and SVO heuristics with fixed
and variable tightness random CSP. A fixed tightness random problem is characterized
by $\langle n, m, p_1, p_2 \rangle$ where $n$ is the number of variables, $m$ is the number of values for
each variable, $p_1$ is the graph connectivity as the proportion of existing constraints
(the number of constrained variable pairs is exactly $p_1 n(n-1)/2$), and $p_2$ is the
constraint tightness as the proportion of forbidden value pairs between two
constrained variables (the number of forbidden value pairs is exactly $p_2 m^2$). The
constrained variables and their nogoods are randomly selected [Smith, 94; Prosser,
94]. A variable tightness random problem is characterized by $\langle n, m, p_1, p_2^{inf}, p_2^{sup} \rangle$
where the tightness is randomly chosen in the interval $[p_2^{inf}, p_2^{sup}]$ for each pair of
constrained variables [Larrosa and Meseguer, 95]. Four sets of problems were
generated:

1. Fixed tightness $\langle 15, 5, p_1, p_2 \rangle$, with $p_1$ taking values 25/105, 50/105, 75/105
and 105/105, and $p_2$ taking values 1/25, 2/25, ..., 25/25.

2. Fixed tightness $\langle 10, 10, p_1, p_2 \rangle$, with $p_1$ taking values 15/45, 25/45, 35/45 and
45/45, and $p_2$ taking values 1/100, 2/100, ..., 100/100.

3. Variable tightness $\langle 15, 5, p_1, 1/25, 24/25 \rangle$, with $p_1$ taking values 25/105,
45/105, 65/105, 85/105 and 105/105.

4. Variable tightness $\langle 10, 10, p_1, 1/100, 99/100 \rangle$, with $p_1$ taking values 15/45,
25/45, 35/45 and 45/45.

For each parameter setting 50 problems were generated, forming four sets of 5000, 20000,
200 and 200 instances respectively. They have been solved with an upper bound on
the search effort of 40,000,000 consistency checks. All algorithms were implemented
in C and run on a Sun SPARCstation 20.
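
A generator for the fixed tightness model can be sketched as below. This is an illustration of the $\langle n, m, p_1, p_2 \rangle$ description only, not the generator used in the experiments; the function name, the use of Python's `random` module and the representation of each constraint by its set of nogoods are assumptions.

```python
import random
from itertools import combinations, product


def random_fixed_tightness_csp(n, m, p1, p2, seed=None):
    """Random binary CSP with exactly p1*n*(n-1)/2 constrained variable pairs,
    each forbidding exactly p2*m*m value pairs (its nogoods)."""
    rng = random.Random(seed)
    variable_pairs = list(combinations(range(n), 2))
    value_pairs = list(product(range(m), repeat=2))
    n_constraints = round(p1 * len(variable_pairs))
    n_nogoods = round(p2 * m * m)
    return {
        pair: set(rng.sample(value_pairs, n_nogoods))
        for pair in rng.sample(variable_pairs, n_constraints)
    }


# One instance of the class <15, 5, 25/105, 10/25>.
problem = random_fixed_tightness_csp(15, 5, 25 / 105, 10 / 25, seed=0)
```

Sampling without replacement keeps the number of constraints and nogoods exact, as the model requires, rather than merely expected.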

The first experiment was devised to evaluate the effect of using DAC with
P-EFC3 and the benefits of our implementation. We solved the class of <n=15, m=5>
fixed tightness problems using plain P-EFC3, P-EFC3+DAC1 and P-EFC3+DAC2,
ordering values by increasing IC, DAC and IC+DAC respectively. Variables were
ordered lexicographically. The average search effort, as the number of consistency
checks, appears in Fig. 3. It can be observed that the most efficient algorithm is
P-EFC3+DAC2, which saves up to 70% of the consistency checks performed by
P-EFC3+DAC1. The algorithms using DAC are clearly better than pure P-EFC3. In
addition, there is a different pattern in the difficulty of problems depending on whether
DAC are used or not. If DAC are not used, problems become harder on average when
tightness is increased. If DAC are used, we observe an easy-hard-easy pattern in the
search effort. The left easy part corresponds to problems with low or intermediate
tightness. They have solutions satisfying every (or almost every) constraint, and
P-EFC3 (either with or without DAC) does not have to invest much effort to find them.
The right easy part corresponds to problems with very high tightness. They have
many arc-inconsistencies, DAC take high values and have an important contribution
to the lower bound, causing P-EFC3 with DAC to perform a more efficient pruning
than pure P-EFC3. The hard part, where the peak in the search effort occurs,
corresponds to problems with high tightness. For these problems, the DAC effect is
not enough to prune at high levels of the tree. For a more comprehensive description
of this phenomenon see [Larrosa and Meseguer, 96].

The second experiment was devoted to evaluating single SVO heuristics. Since all
heuristics, except ACC mean, are based on graph topology, full connectivity
problems were discarded in this experiment. We solved the class of <n=15, m=5> fixed
tightness problems using P-EFC3+DAC2 and the following SVO heuristics: BD,
FD, DG and AC, breaking ties lexicographically. The results appear in Figure 4,
from which we observe that BD, without a second criterion to break ties, is a bad
heuristic (as was already affirmed in [Wallace and Freuder, 93]). AC gives a good
advantage with respect to BD. FD and DG are the best heuristics, without a clear
difference between them. The dominance of FD and DG can be justified by the fact
that they include several objectives for SVO heuristics (see Section 5). The savings
caused by SVO heuristics are really important; this can be seen by comparing Figures
3 and 4.

The third experiment was devoted to evaluating combinations of two SVO
heuristics, where the second is used to break ties. We selected a second criterion
complementary to the first with respect to the objectives discussed in Section 5. We
solved the class of <n=15, m=5> fixed tightness problems using P-EFC3+DAC2,
testing the following combinations: FD/BD, FD/AC, BD/FD, BD/AC and DG/AC.
The results appear in Figure 5, from which we observe that BD as first criterion, with
a second heuristic to break ties (FD or AC), is a bad SVO heuristic. This observation
does not match the experiments presented in [Wallace and Freuder, 93], although
this discrepancy may be explained by the differences in the problems to solve (in
[Wallace and Freuder, 93] random problems had variable domain and tightness). With
respect to the three other combinations, there is not a large difference among them; it
seems that FD/BD slightly dominates in the peak of the search effort. Comparing
Figures 4 and 5, we observe that the addition of a second criterion for FD and DG
causes only small improvements in their performance.

The fourth experiment aimed at comparing static and dynamic ordering heuristics.
We solved the class of <n=10, m=10> fixed tightness problems with P-EFC3 using
the largest mean DVO heuristic (LM, which selects the variable with the largest mean of
IC among its feasible values [Freuder and Wallace, 92]), and with P-EFC3+DAC1
and P-EFC3+DAC2, both using the FD/BD heuristic. The results appear in Figure 6,
from which we observe that DAC usage plus FD/BD clearly dominates pure P-EFC3
with LM. In addition, they confirm that P-EFC3+DAC2 dominates P-EFC3+DAC1
when the FD/BD heuristic is used.

The fifth experiment was devoted to checking the performance of the algorithms on
variable tightness problems. We solved the classes of <n=10, m=10> and
<n=15, m=5> variable tightness problems with P-EFC3 using the LM heuristic, and
P-EFC3+DAC1 and P-EFC3+DAC2 using FD/BD. The results appear in Figure 7,
where the horizontal axis represents varying connectivity. From these results we
observe that P-EFC3+DAC2 clearly surpasses P-EFC3+DAC1 in the whole set of
problems, which indicates that the improvement in performance shown for fixed
tightness does not depend on constraint homogeneity. In addition, we see that
P-EFC3 with LM surpasses P-EFC3+DAC1 in one problem class. Further experiments
on smaller variable tightness intervals are required to properly qualify this
phenomenon.

In summary, experimental results clearly confirm the practical benefits of the
proposed algorithm, P-EFC3+DAC2, for fixed and variable tightness problems. The
benefits over a previous approach are maintained when using SVO heuristics, which,
in addition, can largely improve the algorithm performance. The analysis of heuristics
based on SVO objectives (Section 5) provided some explanation of their relative
efficiency. Results suggest that the combination FD/BD is the option of choice,
because it is slightly superior to other combinations and it is quite easy to compute.

Fig. 3. Number of consistency checks for the class of <n=15, m=5> fixed tightness
problems solved with different algorithms with lexicographical variable ordering.

Fig. 4. Number of consistency checks for the class of <n=15, m=5> fixed tightness
problems solved with P-EFC3+DAC2 with different SVO heuristics.

Fig. 5. Number of consistency checks for the class of <n=15, m=5> fixed tightness
problems solved with P-EFC3+DAC2 with different SVO heuristic combinations.

Fig. 6. Number of consistency checks for the class of <n=10, m=10> fixed tightness
problems solved with different algorithms and heuristics.

Fig. 7. Number of consistency checks for two classes of variable tightness problems
solved with different algorithms.


7  Conclusions 

From this work we extract the following conclusions. First, information coming
exclusively from future variables is very valuable for improving the performance of
MAX-CSP solving algorithms. This idea was first developed in the interesting work
of [Wallace, 94] introducing the use of DAC counts, which has been enhanced in the
present paper. Second, considering this information from future variables, we have
analyzed some SVO heuristics already proposed and we have generated new heuristics
which have been found effective. And third, although DAC use could be considered in
principle as a technical refinement for existing algorithms without any other
implication, its usage has allowed us to discover a region of very tight problems
which are easy for MAX-CSP.

Abstract. We consider the Weighted Constraint Satisfaction Problem
which is a central problem in Artificial Intelligence. Given a set of
variables, their domains and a set of constraints between variables, our goal
is to obtain an assignment of the variables to domain values such that
the weighted sum of satisfied constraints is maximized. In this paper, we
present a new approach based on randomized rounding of semidefinite
programming relaxation. Besides having provable worst-case bounds, our
algorithm is simple and efficient in practice, and produces better
solutions than other polynomial-time algorithms such as greedy and
randomized local search.


1  Introduction 

An instance of the binary Weighted Constraint Satisfaction Problem (W-CSP)
is defined by a set of variables, their associated domains of values and a set of
binary constraints governing the assignment of variables to values. Each
constraint is associated with a positive integer weight. The output is an assignment
which maximizes the weighted sum of satisfied constraints. W-CSP is a
generalization of important combinatorial optimization problems such as the Maximum
Cut Problem. Many real-world problems can be represented as W-CSP, among
which are scheduling and timetabling problems. In scheduling for example, our
task is to assign resources to jobs under a set of constraints, some of which are
more important than others. Most often, instances are over-constrained and no
solution exists that satisfies all constraints. Thus, our goal is to find an
assignment which maximizes the weights of the satisfied constraints.
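
As a small illustration of the W-CSP objective, the sketch below encodes a weighted Maximum Cut instance as a W-CSP, using one "endpoints take different values" constraint per edge, and finds an optimal assignment by exhaustive search. The tuple layout `(i, j, allowed_pairs, weight)` and the function names are assumptions made for this example; the enumeration is only practical for very small instances and is not the approach proposed in this paper.

```python
from itertools import product


def satisfied_weight(assignment, constraints):
    """Weighted sum of the binary constraints satisfied by a total assignment."""
    return sum(weight
               for i, j, allowed, weight in constraints
               if (assignment[i], assignment[j]) in allowed)


def brute_force_wcsp(n, domain, constraints):
    """Enumerate every assignment and return one of maximum satisfied weight."""
    return max(product(domain, repeat=n),
               key=lambda a: satisfied_weight(a, constraints))


# Weighted triangle: an edge is "cut" when its endpoints take different values.
cut = {(0, 1), (1, 0)}
triangle = [(0, 1, cut, 3), (0, 2, cut, 2), (1, 2, cut, 1)]
best = brute_force_wcsp(3, [0, 1], triangle)
best_weight = satisfied_weight(best, triangle)
```

No assignment can cut all three edges of a triangle, so the instance is over-constrained and the best assignment gives up the lightest edge.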

Finding optimal solutions of W-CSP is known to be computationally hard.
In the CSP research community, work in W-CSP is not as abundant as work
in the standard CSP. Freuder and Wallace [3, 4] gave the first formal definition
of PCSP which is a special case of W-CSP having unit weights. For PCSP, the
objective is to satisfy as many constraints as possible. Freuder and Wallace
proposed a polynomial time algorithm based on reverse breadth-first search to solve
PCSP whose underlying constraint network is a tree. For the general PCSP, they
proposed a general framework based on branch-and-bound and its enhancements
in [17, 16]. Incomplete algorithms which yield near-optimal or approximate
solutions have also been investigated, including heuristic repair methods [18], the
connectionist architecture GENET [12] and guided local search [14].

Our  work  is  motivated  mainly  by  the  potential  application  of  W-CSP  in 
scheduling.  With  the  rapid  increase  in  the  speed  of  computing  and  the  growing 
need  for  efficiency  in  scheduling,  it  becomes  increasingly  important  to  explore 
ways  of  obtaining  better  schedules  at  some  extra  computational  cost,  short  of 
going  all  the  way  towards  the  usually  futile  attempt  of  finding  a  guaranteed 
optimal  schedule.  Our  paper  describes  a  new  approach  meant  to  achieve  this 
goal. 

Another  highlight  of  our  approach  is  that  the  solution  obtained  has  a  provable 
worst-case  bound  in  terms  of  weight  when  compared  with  the  optimal  solution. 
This  contrasts  with  conventional  incomplete  algorithms  which  are  empirically 
good  but  are  not  guaranteed  to  perform  well  in  the  worst  case.  In  [8],  we  per¬ 
formed  a  worst-case  analysis  of  local  search  for  PCSP.  In  [9],  Lau  and  Watanabe 
obtained  lower  and  upper  bounds  of  various  rounding  methods  for  W-CSP.  In 
this  paper,  our  emphasis  is  to  apply  the  theory  proposed  in  [9]  to  practice:  we 
will  give  a  careful  account  of  the  mathematical  modelling,  the  algorithm  de¬ 
sign,  as  well  as  experimental  performance  of  our  algorithm.  The  experimental 
results  turn  out,  to  our  pleasant  surprise,  to  be  much  stronger  than  the  theoret¬ 
ical  worst-case  bound.  Nevertheless,  besides  being  theoretically  interesting,  the 
knowledge  of  the  worst-case  performance  gives  us  some  peace  of  mind:  that  our 
algorithm  will  never  perform  embarrassingly  poorly.  This  is  an  important  factor 
to  consider  in  scheduling  critical  resources. 

Randomized  Rounding  Our  approach  is  heavily  based  on  the  notion  of  random¬ 
ized  rounding.  Randomized  algorithms  have  proved  to  be  powerful  in  the  design 
of  approximate  algorithms  for  combinatorial  optimization  problems.  An  inter¬ 
esting  and  efficient  algorithmic  paradigm  is  that  of  randomized  rounding,  due 
to  Raghavan  and  Thompson  [11].  The  key  idea  is  to  formulate  a  given  optimiza¬ 
tion  problem  as  an  integer  program  and  then  find  an  approximate  solution  by 
solving  a  polynomial-time  solvable  convex  mathematical  program  such  as  a  lin¬ 
ear  program.  The  linear  program  must  constitute  a  “relaxation”  of  the  problem 
under  consideration,  i.e.  all  integer  solutions  are  feasible  for  the  linear  program 
and  have  the  same  value  as  they  do  in  the  integer  program.  One  easy  way  to 
do  this  is  to  drop  the  integrality  conditions  on  the  variables.  Given  the  optimal 
fractional  solution  of  the  linear  program,  the  question  is  how  to  find  a  good 
integer  solution.  Traditionally,  one  rounds  the  variables  to  the  nearest  integers. 
Randomized  rounding  is  a  technique  which  treats  the  values  of  the  fractional 
solution  as  a  probability  distribution  and  obtains  an  integer  solution  using  this 
distribution.  Raghavan  and  Thompson  showed,  using  basic  probability  theory, 
that  the  values  chosen  under  the  distribution  do  in  fact  yield  a  solution  near  the 
expectation,  thus  giving  good  approximate  solutions  to  the  integer  program. 

Essentially,  we  will  solve  W-CSP  using  what  is  known  as  randomized  rounding 
of a semidefinite program relaxation. A semidefinite program is the optimization
problem of a linear function of a symmetric matrix subject to linear equality
constraints and the constraint that the matrix be positive semidefinite.
Semidefinite programming is a generalization of linear programming and a special case
of  convex  programming.  The  simplex  method  can  be  generalized  to  semidefinite 
programs.  By  interior  point  methods,  one  can  show  that  semidefinite  program¬ 
ming  is  solvable  in  polynomial  time  under  some  realistic  assumptions.  Semidef¬ 
inite  programming,  like  linear  programming,  has  been  an  active  research  topic 
among  the  Operations  Research  community;  for  details,  see  a  good  survey  writ¬ 
ten  by  Alizadeh  [1].  In  practice,  there  are  solvers  which  yield  solutions  quickly 
for  reasonably  large  instances,  as  our  experiments  would  show.  The  idea  is  to 
represent  W-CSP  as  a  quadratic  integer  program  and  solve  a  corresponding  in¬ 
stance  of  semidefinite  programming.  The  solution  returned  by  the  semidefinite 
program  is  then  rounded  to  a  valid  assignment  by  randomized  rounding.  This 
approach  yields  a  randomized  algorithm.  To  convert  it  into  a  deterministic  al¬ 
gorithm,  we  apply  the  method  of  conditional  probabilities  which  is  a  well-known 
probabilistic  method  in  combinatorics.  This  method  is  clearly  explained  in  the 
text  of  Alon  and  Spencer  [2]  and  the  idea  will  be  adapted  for  our  purpose  in  this 
paper. 

Hence, our approach offers a polynomial-time algorithm, which can be efficiently
implemented. Our experiments illustrate that this approach can handle
problems  of  sizes  beyond  what  enumerative  search  algorithms  can  handle,  and 
thus  is  a  candidate  for  solving  real-world  large-scale  problem  instances.  Recently, 
Goemans and Williamson applied a similar approach to find approximate solutions
for the Maximum Cut Problem [6]. Besides having a strong provable worst-case
bound, their computational experiments show that, on a number of different
types of random graphs, their algorithm yields solutions which are usually within
4% of the optimal solution.

This  paper  is  organized  as  follows.  Section  2  gives  the  definitions  and  nota¬ 
tions  which  will  be  used  throughout  the  paper.  In  section  3,  we  introduce  the 
method  of  conditional  probabilities  and  derive  a  linear-time  greedy  algorithm 
which  can  be  used  to  derandomize  our  randomized  rounding  algorithms.  We 
prove  that  a  naive  application  of  the  greedy  algorithm  always  returns  a  solution 
whose weight is guaranteed to be at least s times the total weight, where
0 < s < 1 is the strength^1 of the constraints. In section 4, we show how to
formulate W-CSP by quadratic integer programs. We discuss how randomized
rounding can be used to yield solutions which have a constant worst-case bound
for domain size k ≤ 3. A more sophisticated rounding scheme due to Goemans
and Williamson [6] will be discussed, which yields a better worst-case bound for
domain size 2. However, their algorithm cannot be de-randomized in a practical
sense to date. In section 5, we compare the performance of our algorithm
with other approaches experimentally on random W-CSP instances. For all 300
instances, we are able to obtain solutions whose values are within 4% of the
optimal.  This  is  significantly  better  than  the  other  polynomial-time  algorithms 
under  comparison. 


^1 For standard unweighted CSP, this measure is often known as looseness.

2 Preliminaries

Let $V = \{1, \ldots, n\}$ be a set of variables. Each variable has a domain which
contains the set of values that can be assigned. For simplicity, we assume that all
domains have fixed size $k$ and are equal to the set $K = \{1, \ldots, k\}$. A constraint
between two variables $i$ and $l$ is a binary relation over $K \times K$ which defines
the pairs of values that $i$ and $l$ can take simultaneously. Given an assignment
$\sigma : V \to K$, the constraint is said to be satisfied iff the pair $(\sigma_i, \sigma_l)$ is an
element of the relation. The Weighted Constraint Satisfaction Problem (W-CSP)
is defined by a set $V$ of variables, a collection $M$ of constraints, an integer $k$, and a
weight function $w : M \to \mathbb{Z}^+$. The output is an assignment $\sigma : V \to K$ such
that the weighted sum of satisfied constraints (or simply weight) is maximized.

Denote by W-CSP($k$) the class of instances with domain size $k$. For each
constraint $j \in M$, let $w_j$, $R_j$ and $s_j = |R_j| / k^2$ denote its weight, relation and
strength (looseness) respectively; let $\alpha_j$ and $\beta_j$ denote the indices of the two
variables incident on $j$; and let $c_j(u,v) = 1$ if $(u,v) \in R_j$ and 0 otherwise. That
is, $c_j(u,v)$ indicates whether the value pair $(u,v)$ is consistent in constraint $j$.
Let $s = \sum_{j \in M} w_j s_j / \sum_{j \in M} w_j$ denote the strength of a W-CSP instance (i.e.
the weighted average of the strengths of all its constraints). Note that $s \ge 1/k^2$ because
a constraint relation contains at least 1 out of the $k^2$ possible pairs. A W-CSP
instance is satisfiable iff there exists an assignment which satisfies all constraints
simultaneously.

A quadratic integer program (QIP) has the form:

maximize    $\sum_{i,l} g_{il}\, x_i x_l + \sum_i h_i x_i$

subject to  $\sum_i c_{ij} x_i = d_j$   for all constraints $j$

            $x_i$ integer,   for all variables $i$.

We  say  that  a  maximization  problem  P  can  be  approximated  within  0  <  e  <  1 
iff  there  exists  a  polynomial-time  algorithm  A  such  that  for  all  input  instances 
y  of  P,  the  ratio  A{y)/OPT(y)  is  at  least  e,  where  A{y)  and  OPT{y)  denote  the 
objective  value  of  the  solution  returned  by  A  and  the  optimal  objective  value  of 
y  respectively.  The  quantity  e  is  commonly  known  as  the  performance  guarantee 
or  approximation  ratio  for  P.  The  ratio  is  absolute  if  the  denominator  is  the 
maximum  possible  objective  value  instead  of  OPT(y).  In  the  case  of  W-CSP 
for  example,  the  maximum  possible  objective  value  is  the  sum  of  edge  weights, 
although  the  optimal  value  can  be  much  smaller.  Hence,  the  absolute  ratio  is 
always  a  lower  bound  of  (and  therefore  better  bound  than)  the  performance 
guarantee.  Observe  that  the  ratio  is  as  close  to  1  as  the  solution  is  close  to  an 
optimum  solution. 
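
The relation between the two ratios can be written out in one line. This is only a worked restatement of the paragraph above; the symbol $\rho_{\mathrm{abs}}$ is introduced here for illustration and does not appear elsewhere in the paper.

```latex
OPT(y) \;\le\; \sum_{j \in M} w_j
\quad\Longrightarrow\quad
\rho_{\mathrm{abs}}(y) \;=\; \frac{A(y)}{\sum_{j \in M} w_j}
\;\le\; \frac{A(y)}{OPT(y)} .
```

Hence any guarantee proved for the absolute ratio carries over to the performance guarantee, and the two coincide on satisfiable instances, where $OPT(y)$ equals the total weight.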


3  Method  of  Conditional  Probabilities 

In  this  section,  we  introduce  the  method  of  conditional  probabilities.  We  derive 
an efficient linear-time greedy algorithm which can be used to derandomize the
randomized rounding algorithm presented later. We will also analyse the worst-case
performance of a naive application of the greedy algorithm.

Consider an instance of W-CSP. Suppose we are given an $n$ by $k$ matrix
$\Pi = (p_{i,u})$ such that all $p_{i,u} \in [0, 1]$ and $\sum_{u \in K} p_{i,u} = 1$ for all $1 \le i \le n$. If we
assign each variable $i$ independently to value $u$ with probability $p_{i,u}$, we obtain
a probabilistic assignment whose expected weight is given by

$$\hat{W} = \sum_{j \in M} w_j \times \Pr[\text{constraint } j \text{ is satisfied}] = \sum_{j \in M} w_j \Big( \sum_{u,v \in K} c_j(u,v)\, p_{\alpha_j,u}\, p_{\beta_j,v} \Big).$$

The method of conditional probabilities specifies that there must exist an
assignment whose weight is at least $\hat{W}$ and that such an assignment can be
found deterministically in polynomial time provided that certain conditional
probabilities can be computed efficiently.

We will show how to derive one such algorithm. Let $\sigma$ be a partial assignment
such that variables $1, \ldots, t-1$ have been assigned values, and variables $t, \ldots, n$
are unassigned. Define:

$$q_{i,u} = \begin{cases} p_{i,u}, & \text{if } i \ge t \\ 1, & \text{if } i \le t-1 \text{ and } \sigma_i = u \\ 0, & \text{otherwise.} \end{cases}$$

That is, $q_{i,u}$ is the probability that variable $i$ is assigned value $u$ with respect
to the partial assignment $\sigma$. Notice that if $\sigma$ is completely unassigned, then
$q_{i,u} = p_{i,u}$ for all $i$ and $u$. The expected weight of $\sigma$ is given by:

$$W = \sum_{j \in M} w_j \Big( \sum_{u,v \in K} c_j(u,v)\, q_{\alpha_j,u}\, q_{\beta_j,v} \Big).$$

This suggests that we can construct a complete assignment of weight at least
the initial expected weight $\hat{W}$ iteratively in a greedy fashion: each time, assign
the next variable to the value that maximizes the expected weight of the resulting
partial assignment. The greedy (derandomization) algorithm is codified
as follows:


procedure Greedy:
begin
    set $W = \hat{W}$;
    for all $i = 1, \ldots, n$ do
        for all $u \in K$ compute $W_u$;
        assign $i$ to $v$ s.t. $W_v$ is maximized;
        set $W = W_v$;
    endfor
end.

At the beginning of iteration $i$, $W$ denotes the expected weight of the partial
assignment where variables $1, \ldots, i-1$ are fixed and variables $i, \ldots, n$ are assigned
according to distribution $\Pi$. $W_u$ denotes the expected weight of the same
partial assignment, except that variable $i$ is fixed to the value $u$. From the law
of conditional probabilities:

$$W = \sum_{u=1}^{k} W_u \cdot p_{i,u}.$$

Since we always pick $v$ such that $W_v$ is maximized, $W$ is non-decreasing in all
iterations.
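
The monotonicity claim follows from a one-line averaging argument, spelled out here as a worked step (it uses only the identity above and the fact that the $p_{i,u}$ are non-negative and sum to 1 over $u$):

```latex
W_v \;=\; \max_{u \in K} W_u
\;\ge\; \sum_{u=1}^{k} p_{i,u}\, W_u
\;=\; W .
```

After the last iteration every variable is fixed, so the final $W$ is the actual weight of a complete assignment, and it is at least the expected weight the procedure started from.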

Therefore, to obtain assignments of large weights, the key factor is to obtain
the probability distribution matrix $\Pi$ such that the expected weight is as large as
possible. In the following, we consider the most naive probability distribution,
the random assignment, i.e. for all $i$ and $u$, we have $p_{i,u} = 1/k$. By linearity of
expectation (i.e. the expected sum of random variables is equal to the sum of expected
values of random variables), the expected weight of the random assignment is
given by

$$\hat{W} = \sum_{j \in M} w_j \cdot s_j = s \sum_{j \in M} w_j.$$

That is, the expected weight is $s$ times the total edge weight, implying that
W-CSP($k$) can be approximated within absolute ratio $s$.

For this naive approach, $W_u$ can be derived from $W$ as follows. Maintain a
vector $r$ where $r_j$ stores the probability that constraint $j$ is satisfied given that
variables $1, \ldots, i-1$ are fixed and the remaining variables are assigned randomly.
Then, $W_u$ is just $W$ offset by the change in probabilities of satisfiability of those
constraints incident to variable $i$. More precisely,

$$W_u = W + \sum_{j \text{ incident to } i} w_j\, (r'_j - r_j)$$

where $r'_j$ is the new probability of satisfiability of constraint $j$. Letting $l$ be the
second variable connected by $j$, $r'_j$ is computed as follows:

    if $l < i$ (i.e. $l$ has been assigned)
        then set $r'_j$ to 1 if $(u, \sigma_l) \in R_j$ and 0 otherwise
        else set $r'_j$ to the fraction $|\{v \in K : (u, v) \in R_j\}| / k$.

Clearly, the computation of each $W_u$ takes $O(m_i k)$ time, where $m_i$ is the
degree of variable $i$. Hence, the total time needed is $O(\sum_i m_i k^2) = O(m k^2)$,
which is linear in the size of the input.
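
The greedy derandomization of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: variables and values are numbered from 0, a constraint is a hypothetical tuple `(i, l, weight, relation)` with `i != l`, and the function names are invented for the example. For brevity it recomputes the contribution of the constraints incident to the current variable for each candidate value, so it does not attain the $O(mk^2)$ bound derived above.

```python
# A constraint is (i, l, weight, relation); relation is the set of allowed
# (value_of_i, value_of_l) pairs. Variables and values are 0-based here.


def expected_weight(constraints, q):
    """Expected weight when variable i takes value u with probability q[i][u]."""
    return sum(
        w * sum(q[i][u] * q[l][v] for (u, v) in rel)
        for (i, l, w, rel) in constraints
    )


def assignment_weight(constraints, assignment):
    """Weighted sum of the constraints satisfied by a complete assignment."""
    return sum(
        w for (i, l, w, rel) in constraints
        if (assignment[i], assignment[l]) in rel
    )


def greedy_derandomize(n, k, constraints, p):
    """Fix variables 0..n-1 in order, never decreasing the expected weight."""
    q = [row[:] for row in p]              # q starts as the distribution p
    incident = [[] for _ in range(n)]
    for c in constraints:
        incident[c[0]].append(c)
        incident[c[1]].append(c)
    assignment = []
    for i in range(n):
        best_u, best_w = 0, float("-inf")
        for u in range(k):
            q[i] = [1.0 if v == u else 0.0 for v in range(k)]
            # Constraints not touching i contribute the same amount for every
            # u, so the incident ones are enough to rank the candidates.
            w_u = expected_weight(incident[i], q)
            if w_u > best_w:
                best_u, best_w = u, w_u
        q[i] = [1.0 if v == best_u else 0.0 for v in range(k)]
        assignment.append(best_u)
    return assignment
```

Passing `p = [[1.0 / k] * k for _ in range(n)]` corresponds to the naive random assignment analysed above; any other row-stochastic matrix, such as one produced by a rounding scheme, can be supplied in its place.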

4  Randomized  Rounding  of  Semidefinite  Program 

In  this  section,  we  present  our  main  algorithm.  Essentially,  the  idea  is  to  represent 
W-CSP  as  a  QIP  and  apply  randomized  rounding  to  its  semidefinite  program¬ 
ming  relaxation.  The  resulting  randomized  algorithm  is  then  de-randomized  into 
a deterministic algorithm by the method of conditional probabilities.

4.1 A Simple and Efficient Rounding Scheme

Consider an instance of W-CSP($k$) and formulate a corresponding 0/1 QIP instance
(Q) as follows. For every variable $i \in V$, define $k$ 0/1 variables $x_{i,1}, \ldots, x_{i,k}$
in (Q) such that $i$ is assigned to $u$ in the W-CSP instance iff $x_{i,u}$ is assigned to
1 in (Q).

Q: maximize    $\sum_{j \in M} w_j f_j(x)$

   subject to  $\sum_{u \in K} x_{i,u} = 1$   for $i \in V$   [1]

               $x_{i,u} \in \{0, 1\}$   for $i \in V$ and $u \in K$

In the above formulation, $f_j(x) = \sum_{u,v \in K} c_j(u,v)\, x_{\alpha_j,u}\, x_{\beta_j,v}$ encodes the
satisfiability of constraint $j$ and hence the objective function gives the weight of the
assignment. Constraint [1] ensures that every W-CSP variable gets assigned to
exactly one value.

Next, convert this 0/1 QIP into an equivalent QIP (Q') whose variables take
values in $\{-1, +1\}$:

Q': maximize    $\sum_{j \in M} w_j f_j(x)$

    subject to  $\sum_{u \in K} x_0 x_{i,u} = -(k-2)$   for $i \in V$

                $x_{i,u} \in \{-1, +1\}$   for $i \in V$ and $u \in K$
                $x_0 = +1$

Here, we have $f_j(x) = \frac{1}{4} \sum_{u,v \in K} c_j(u,v) \left(1 + x_{\alpha_j,u} x_{\beta_j,v} + x_0 x_{\alpha_j,u} + x_0 x_{\beta_j,v}\right)$.
The reason for introducing a dummy variable $x_0$ is so that all terms occurring in the
formulation are quadratic, which is necessary for the subsequent semidefinite programming
relaxation.
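
One way to see where (Q') comes from is the usual change of variables between 0/1 and $\pm 1$ encodings. The derivation below is a sketch; the primed symbols $x'_{i,u}$ for the $\pm 1$ variables are introduced here only to keep the two formulations apart. Substitute $x_{i,u} = (1 + x_0 x'_{i,u})/2$ with $x_0 = +1$, and use $x_0^2 = 1$:

```latex
\sum_{u \in K} \frac{1 + x_0 x'_{i,u}}{2} = 1
  \;\Longleftrightarrow\;
  \sum_{u \in K} x_0 x'_{i,u} = 2 - k = -(k-2),
\\[4pt]
x_{\alpha_j,u}\, x_{\beta_j,v}
  = \frac{(1 + x_0 x'_{\alpha_j,u})(1 + x_0 x'_{\beta_j,v})}{4}
  = \frac{1}{4}\Bigl(1 + x'_{\alpha_j,u} x'_{\beta_j,v}
      + x_0 x'_{\alpha_j,u} + x_0 x'_{\beta_j,v}\Bigr).
```

The first line gives the equality constraint of (Q'), and summing the second line against $c_j(u,v)$ gives its objective term $f_j$.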

Having  formulated  the  W-CSP  instance  as  a  QIP,  the  next  step  is  to  find  an 
appropriate relaxation which is polynomial-time solvable. One such candidate is
the  linear  programming  relaxation.  It  has  been  shown  that  linear  programming 
relaxations  do  not  yield  a  strong  bound  compared  with  semidefinite  program¬ 
ming  relaxations  for  small  domain  sizes  [9].  In  the  following,  we  will  only  discuss 
semidefinite  programming  relaxations. 

The essential idea is to coalesce a quadratic term $x_i x_j$ into a matrix variable
$y_{i,j}$. Let $Y$ denote the $(kn+1) \times (kn+1)$ matrix comprising these matrix variables.
The resulting relaxation problem (P) is the following:

P: maximize    $\sum_{j \in M} w_j F_j(Y)$

   subject to  $\sum_{u \in K} y_{0,iu} = -(k-2)$   for $i \in V$

               $y_{iu,iu} = 1$   for $i \in V$ and $u \in K$   [2]

               $y_{0,0} = 1$

               $Y$ symmetric positive semidefinite.

Here, $F_j(Y) = \frac{1}{4} \sum_{u,v} c_j(u,v) \left(1 + y_{\alpha_j u, \beta_j v} + y_{0, \alpha_j u} + y_{0, \beta_j v}\right)$.

Hence, we have a semidefinite program, which can be solved in polynomial
time within an additive factor (see [1]). By a well-known theorem in Linear
Algebra, a $t \times t$ matrix $Y$ is symmetric positive semidefinite iff there exists a full
row-rank $r \times t$ matrix $B$ ($r \le t$) such that $Y = B^T B$ (see for example, [7]).

One such matrix $B$ can be obtained in $O(n^3)$ time by an incomplete Cholesky
decomposition. Since $Y$ has all 1's on its diagonal (by constraint [2]), the
decomposed matrix $B$ corresponds precisely to a list of $t$ unit vectors $X_1, \ldots, X_t$
where column $c$ of $B$ gives the vector $X_c$. Furthermore, these vectors have the
nice property that $X_c \cdot X_{c'} = y_{c,c'}$. The notation $X_1 \cdot X_2$ denotes the inner product
of the vectors $X_1$ and $X_2$.
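
The factorization step can be illustrated with NumPy. Note that the sketch below swaps in a symmetric eigendecomposition (`numpy.linalg.eigh`) for the incomplete Cholesky decomposition mentioned above, since it handles rank-deficient matrices without extra care; the function name `gram_vectors` and the tolerance are assumptions made for the example.

```python
import numpy as np


def gram_vectors(Y, tol=1e-9):
    """Return a full row-rank B with Y ~= B.T @ B; column c of B is X_c."""
    Y = np.asarray(Y, dtype=float)
    # Symmetrize to absorb small numerical asymmetry in a solver's output.
    eigvals, eigvecs = np.linalg.eigh((Y + Y.T) / 2.0)
    keep = eigvals > tol                     # drop numerically zero directions
    # Row m of B is sqrt(lambda_m) times the m-th kept eigenvector.
    return np.sqrt(eigvals[keep])[:, None] * eigvecs[:, keep].T


if __name__ == "__main__":
    # Small positive semidefinite matrix of rank 2 with unit diagonal.
    Y = np.array([[1.0, -1.0, 0.0],
                  [-1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    B = gram_vectors(Y)
    print(B.shape)                           # r x t with r <= t
    print(np.allclose(B.T @ B, Y))           # inner products reproduce Y
```

Because the diagonal of `Y` is all ones, each column of `B` has unit length, which is what the rounding steps rely on.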

Domain Size 2 and 3 We propose the following randomized algorithm:

1. (Relaxation) Solve the semidefinite program (P) to optimality (within an
additive factor) and obtain an optimal set of vectors $X^*$.

2. (Randomized Rounding) Construct an assignment for the W-CSP instance
as follows. For each $i$, assign variable $i$ to value $u$ with probability
$1 - \arccos(X^*_0 \cdot X^*_{i,u}) / \pi$.

The Rounding step has the following intuitive meaning: the smaller the angle
between $X^*_{i,u}$ and $X^*_0$, the higher the probability that $i$ would be assigned to $u$.
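
The rounding probabilities of step 2 can be sketched as follows, assuming the unit vectors are available as plain Python sequences. The function names are hypothetical, and `sample_value` simply draws a value once the probabilities for one variable are known.

```python
import math
import random


def arccos_probabilities(x0, xs):
    """Probability 1 - arccos(x0 . x_u) / pi for each unit vector x_u in xs."""
    probs = []
    for x in xs:
        dot = sum(a * b for a, b in zip(x0, x))
        dot = max(-1.0, min(1.0, dot))       # guard against rounding error
        probs.append(1.0 - math.acos(dot) / math.pi)
    return probs


def sample_value(probs, rng=random):
    """Draw a value index according to probs (assumed to sum to 1)."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    t = 0.7
    x0 = (1.0, 0.0)
    # For k = 2 the relaxation forces x0 . x1 + x0 . x2 = 0.
    x1 = (math.cos(t), math.sin(t))
    x2 = (-math.cos(t), math.sin(t))
    probs = arccos_probabilities(x0, [x1, x2])
    print(probs, sum(probs))
```

For $k = 2$ the two angles are $\theta$ and $\pi - \theta$, so the two probabilities are $1 - \theta/\pi$ and $\theta/\pi$ and always sum to 1.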

For the case of $k = 2$, the expected weight of the probabilistic assignment
can be shown to be at least 0.408 times the weight of the optimal solution [9].
The randomized rounding step can be converted into a deterministic algorithm
using the technique discussed in section 3. Hence, we can approximate W-CSP(2)
within a worst-case bound of 0.408.

For the case of $k = 3$, the technical difficulty is in ensuring that the sum
of probabilities of assigning a variable to the three values is exactly 1. Fortunately,
by introducing additional valid equations, it is possible to enforce this
condition, which we will now explain.

Call two vectors $X_1$ and $X_2$ opposite if $X_1 = -X_2$.

Lemma 1. Given 4 unit vectors $a$, $b$, $c$, $d$, if

$$a \cdot b + a \cdot c + a \cdot d = -1 \quad (1)$$
$$b \cdot a + b \cdot c + b \cdot d = -1 \quad (2)$$
$$c \cdot a + c \cdot b + c \cdot d = -1 \quad (3)$$
$$d \cdot a + d \cdot b + d \cdot c = -1 \quad (4)$$

then $a$, $b$, $c$ and $d$ must form two pairs of opposite vectors.

Proof. $\frac{1}{2}[(3) + (4) - (1) - (2)]$ gives:

$$a \cdot b = c \cdot d.$$

Similarly, one can show that $a \cdot c = b \cdot d$ and $a \cdot d = b \cdot c$. This means that they
form either two pairs of opposite vectors or two pairs of equal vectors. Suppose
we have the latter case, and w.l.o.g., suppose $a = b$ and $c = d$. Then, by (1),
$a \cdot c = a \cdot d = -1$, implying that we still have two pairs of opposite vectors $(a, c)$
and $(b, d)$. □

Now add the following set of $4n$ valid equations into (Q'). For all $i$:

$$x_0 (x_{i,1} + x_{i,2} + x_{i,3}) = -1$$
$$x_{i,1} (x_0 + x_{i,2} + x_{i,3}) = -1$$
$$x_{i,2} (x_0 + x_{i,1} + x_{i,3}) = -1$$
$$x_{i,3} (x_0 + x_{i,1} + x_{i,2}) = -1$$

By Lemma 1, the relaxation problem (P) will return a set of vectors with the
property that for each $i$, there exists at least one vector $X \in \{X_{i,1}, X_{i,2}, X_{i,3}\}$
opposite to $X_0$ while the remaining two are opposite to each other. Noting that
$1 - \arccos(X_0 \cdot X)/\pi = 0$, the sum of the probabilities of assigning $i$ to the other two
values is exactly 1. Thus, we have reduced the case of $k = 3$ to the case of $k = 2$.

Larger Domain Size Note that the above rounding works only for cases of $k \le 3$.
For domain size greater than 3, we use the following rounding strategy:

assign variable $i$ to value $u$ with probability $\frac{1 + X^*_0 \cdot X^*_{i,u}}{2}$.

This rounding scheme always works for all values of $k$ since, by the equality
constraint of (P), the sum of probabilities for each variable $i$ is equal to

$$\frac{1}{2} \sum_{u=1}^{k} \left(1 + X^*_0 \cdot X^*_{i,u}\right) = \frac{1}{2}\bigl(k - (k-2)\bigr) = 1.$$

Unfortunately, we are unable to date to analyse the worst-case performance of
this rounding scheme.

4.2  Rounding  Scheme  of  Goemans  and  Williamson 

Recently, Goemans and Williamson [6] proposed a nice rounding scheme for
approximating the Maximum Satisfiability Problem. This scheme can be adapted
to give an improved bound for W-CSP(2).

Model a given instance of W-CSP by the following QIP. Each variable has
domain $\{-1, +1\}$. In this way, we can directly use $x_i \in \{-1, +1\}$ to indicate
the value assigned to variable $i$. Introduce an additional variable $x_0$. Again, the
variable $i$ is assigned to $+1$ iff $x_i = x_0$.

Q: maximize    $\sum_{i<l} w_j f_j(x)$

   subject to  $x_i \in \{-1, +1\}$   for $i \in V \cup \{0\}$

(+1,+1)  (+1,-1)  (-1,+1)  (-1,-1)   f_j(x)
   ✓        ✓        ✓        ✓      not a constraint
   ✓        ✓        ✓        ✗      1/4 ((1 + x_0 x_i) + (1 + x_0 x_l) + (1 - x_i x_l))
   ✓        ✓        ✗        ✓      1/4 ((1 + x_0 x_i) + (1 - x_0 x_l) + (1 + x_i x_l))
   ✓        ✗        ✓        ✓      1/4 ((1 - x_0 x_i) + (1 + x_0 x_l) + (1 + x_i x_l))
   ✗        ✓        ✓        ✓      1/4 ((1 - x_0 x_i) + (1 - x_0 x_l) + (1 - x_i x_l))
   ✓        ✓        ✗        ✗      1/2 (1 + x_0 x_i)
   ✗        ✗        ✓        ✓      1/2 (1 - x_0 x_i)
   ✓        ✗        ✗        ✓      1/2 (1 + x_i x_l)
   ✗        ✓        ✓        ✗      1/2 (1 - x_i x_l)
   ✗        ✓        ✗        ✓      1/2 (1 - x_0 x_l)
   ✓        ✗        ✓        ✗      1/2 (1 + x_0 x_l)
   ✓        ✗        ✗        ✗      1/4 ((1 + x_0 x_i) + (1 + x_0 x_l) + (1 + x_i x_l) - 2)
   ✗        ✓        ✗        ✗      1/4 ((1 + x_0 x_i) + (1 - x_0 x_l) + (1 - x_i x_l) - 2)
   ✗        ✗        ✓        ✗      1/4 ((1 - x_0 x_i) + (1 + x_0 x_l) + (1 - x_i x_l) - 2)
   ✗        ✗        ✗        ✓      1/4 ((1 - x_0 x_i) + (1 - x_0 x_l) + (1 + x_i x_l) - 2)
   ✗        ✗        ✗        ✗      not a constraint

Table 1. Table of functions associated with constraint relations. Assume that
constraint j is incident to variables i and l. The column headings give the value
pairs (value of i, value of l); the symbols ✓ and ✗ indicate whether each
value pair is an element of the relation.


where  fj(x)  encodes  the  satisfiability  of  constraint  j.  Table  1  gives  the  function 
fj  associated  with  all  16  possible  constraint  relations. 
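
The entries of Table 1 can be checked mechanically. The sketch below is an illustration rather than part of the original development: it builds $f_j$ for an arbitrary relation as a sum of product indicators $(1 + u\,x_0 x_i)(1 + v\,x_0 x_l)/4$ over the allowed pairs $(u, v)$, and compares it with one tabulated entry. All function names are invented for the example.

```python
from itertools import combinations, product

PAIRS = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]


def f_relation(relation, x0, xi, xl):
    """Sum, over allowed pairs (u, v), of the 0/1 indicator that (i, l) = (u, v)."""
    a, b = x0 * xi, x0 * xl          # values of i and l, read relative to x0
    return sum((1 + u * a) * (1 + v * b) / 4 for (u, v) in relation)


def table_entry_forbid_minus_minus(x0, xi, xl):
    """Table 1 entry for the relation that excludes only (-1, -1)."""
    return ((1 + x0 * xi) + (1 + x0 * xl) + (1 - xi * xl)) / 4


def all_relations():
    """Every subset of the four value pairs (16 relations in total)."""
    for size in range(len(PAIRS) + 1):
        for subset in combinations(PAIRS, size):
            yield set(subset)


def matches_indicator(relation):
    """True if f_relation is 1 exactly on satisfying sign patterns, else 0."""
    return all(
        f_relation(relation, x0, xi, xl) == float((x0 * xi, x0 * xl) in relation)
        for x0, xi, xl in product((-1, 1), repeat=3)
    )


if __name__ == "__main__":
    print(all(matches_indicator(rel) for rel in all_relations()))
```

Expanding the sum of indicators for a given relation and using $x_0^2 = 1$ reproduces the corresponding closed form in the table; the comparison with `table_entry_forbid_minus_minus` checks this for one row.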

Therefore, the problem (Q) can be expressed as:

Q': maximize    $\sum_{i<l} \left[ a_{il} (1 - x_i x_l) + b_{il} (1 + x_i x_l) - c_{il} \right]$

    subject to  $x_i \in \{-1, +1\}$   for $i \in V \cup \{0\}$

where the coefficients $a_{il}$, $b_{il}$ and $c_{il}$ are non-negative.

The following hyperplane partitioning algorithm was proposed in [6]:

1. (Relaxation) Solve (P) optimally and obtain an optimal set of vectors $X^*$.

2. (Randomized Rounding) Let $r$ be a unit vector chosen uniformly at random.
Construct an assignment $x$ for (Q') as follows. For each $i = 0, \ldots, n$, if
$r \cdot X^*_i \ge 0$, then set $x_i = +1$ else set $x_i = -1$.

3. (Normalizing) Construct an assignment for the given W-CSP instance as
follows. If $x_0 = +1$ then return $x$ as the assignment, else ($x_0 = -1$) return
$x$ with all values flipped as the assignment.
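
Steps 2 and 3 above can be sketched with NumPy as follows, assuming the vectors from the relaxation are stacked as the rows of an array with $X^*_0$ first. The function name `hyperplane_round` is hypothetical, and the uniformly random unit vector is obtained by normalizing a standard Gaussian sample, which is one standard way to do it.

```python
import numpy as np


def hyperplane_round(X, rng=None):
    """Rows of X are X_0, ..., X_n; returns x in {-1,+1}^(n+1) with x[0] = +1."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    r = rng.standard_normal(X.shape[1])
    r /= np.linalg.norm(r)                   # uniform direction on the sphere
    x = np.where(X @ r >= 0, 1, -1)          # side of the hyperplane normal to r
    if x[0] == -1:                           # normalizing step: flip all values
        x = -x
    return x


if __name__ == "__main__":
    X = np.array([[1.0, 0.0],                # X_0
                  [1.0, 0.0],                # same direction as X_0
                  [-1.0, 0.0],               # opposite to X_0
                  [0.0, 1.0]])
    print(hyperplane_round(X, np.random.default_rng(0)))
```

A vector equal to $X^*_0$ always ends up with value $+1$ and a vector opposite to it with value $-1$, which matches the intended reading that variable $i$ takes $+1$ iff $x_i = x_0$.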

Basically, the Rounding step chooses a random hyperplane through the origin
of the unit sphere (with $r$ as its normal) and partitions the variables according
to the side of the hyperplane on which their vectors lie. The Normalizing step is
needed to undo the effect of the additional variable $x_0$ in case it is set to $-1$.
Using this algorithm, W-CSP(2) can be approximated within a worst-case bound
of 0.634, which can be improved to 0.878 for satisfiable instances [9].

At present, this method cannot be derandomized easily. A derandomization
method for the above algorithm was proposed in [6], but Mahajan and Ramesh [10]
discovered that it contains a fatal flaw. They then presented a different
derandomization approach in [10]; from a practical point of view, however, their
method is inefficient. This rounding strategy will therefore not be reported in
our experiments.

5  Computational  Experience 

In  this  section,  we  report  our  computational  experience.  Our  experiments  are 
conducted on a SUN Sparc UNIX workstation. Random numbers are generated
using the standard UNIX long random() function, initialized with a
seed which depends on the time of day.

We  naturally  wanted  to  test  our  algorithms  on  hard  W-CSP  instances.  How¬ 
ever,  we  learnt  that  for  PCSP,  there  is  no  localized  region  of  hard  problems  - 
hard  problems  are  located  throughout  the  instance  space,  with  difficulty  increas¬ 
ing  with  increasing  edge  density  of  the  constraint  graph,  increasing  tightness  of 
constraints, and naturally, increasing domain size [15]. Our main concern is the
performance  of  our  algorithm  against  other  incomplete  algorithms  which  run  in 
polynomial  time.  The  measure  of  performance  is  the  approximation  ratio.  Since 
it  is  time-consuming  to  compute  optimal  solutions  for  reasonably  large  instances, 
we  experiment  on  satisfiable  instances  whose  optimal  value  is  always  the  sum 
of  all  edge  weights.  This  allows  us  to  compute  the  approximation  ratio  without 
obtaining  optimal  solutions. 

Generation of Random Instances With the above considerations, we generate
W-CSP instances of $n$ variables from a distribution parameterized by the edge
probability $0 < q < 1$ and the consistency probability $0 < \lambda < 1$ according to the
following rules:

1. Generate a random graph $G$ of $n$ nodes such that an edge (i.e. constraint)
exists between any two variables with probability $q$.

2. To generate a satisfiable instance, we first generate a random assignment
$\sigma$. For each edge $(i_1, i_2)$ in $G$, construct the constraint relation as follows.
Insert the value pair $(\sigma_{i_1}, \sigma_{i_2})$ with probability 1 and all other $k^2 - 1$ pairs
with probability $\lambda$. To generate a non-satisfiable instance, we simply insert
the value pair $(\sigma_{i_1}, \sigma_{i_2})$ with probability $\lambda$.

3. Edge weights are randomly generated in the range [0..999].

This  generation  method  has  been  used  by  others  and  an  online  (unweighted) 
implementation  is  in  [13]. 
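
The generation rules can be sketched in Python as follows. This is an illustrative reading of the three rules, not the generator used for the experiments or the online implementation of [13]; the function name, the 0-based numbering of variables and values, the `(i, l, weight, relation)` tuple layout and the use of Python's `random` module are all assumptions.

```python
import random
from itertools import combinations, product


def generate_instance(n, k, q, lam, satisfiable=True, seed=None):
    """Return (constraints, planted); constraints are (i, l, weight, relation)."""
    rng = random.Random(seed)
    planted = [rng.randrange(k) for _ in range(n)]    # random assignment sigma
    constraints = []
    for i, l in combinations(range(n), 2):
        if rng.random() >= q:                          # edge exists with prob. q
            continue
        forced = (planted[i], planted[l])
        relation = set()
        for pair in product(range(k), repeat=2):
            # The planted pair is always kept in satisfiable mode; every
            # other pair (and all pairs otherwise) is kept with prob. lam.
            keep_prob = 1.0 if (satisfiable and pair == forced) else lam
            if rng.random() < keep_prob:
                relation.add(pair)
        constraints.append((i, l, rng.randint(0, 999), relation))
    return constraints, planted


if __name__ == "__main__":
    cons, sigma = generate_instance(n=10, k=3, q=0.5, lam=0.2, seed=1)
    total = sum(w for (_, _, w, _) in cons)
    print(len(cons), total)
```

In satisfiable mode the planted assignment satisfies every constraint, so the optimal value equals the total edge weight and the approximation ratio of any returned assignment can be computed without solving the instance exactly.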

Algorithms  Four  algorithms  are  compared.  Greedy  LS  refers  to  hill-climbing  local 
search  with  an  initial  assignment  generated  greedily,  i.e.  arrange  the  variables  in 
a  linear  order  and  assign  them  in  sequence  the  value  that  maximizes  the  weighted 
sum of satisfied constraints. Random LS refers to hill-climbing local search with a
random initial assignment. Rand Round refers to our simple rounding algorithm,
and  RR  LS  refers  to  hill-climbing  local  search  with  an  initial  assignment  generated 
by  Rand  Round.  To  solve  a  semidefinite  program,  we  use  the  solver  written  by 
Fujisawa  and  Kojima  [5]. 


Experiment 1 In the first set of experiments, we fix n = 64 and k = 2 and generate
random satisfiable instances by the abovementioned method. We consider
three edge densities (sparse, medium and dense; m refers to the total number
of edges) and for each density, we vary the consistency probability from 0.1
to 0.9. For each case, 10 random satisfiable instances are generated and solved
respectively by the four algorithms. The respective mean approximation ratios
are obtained. Table 2 gives the outcome of the experiment. Figures are rounded
to three decimal places and 1.000(-) is used to denote a value which is rounded
to 1.000.


Experiment  2  In  the  second  set  of  experiments,  we  fix  n  =  20  and  k  =  5.  The 
same scenario as Experiment 1 is repeated. Table 3 gives the outcome of the
experiment. 


Some  Observations  (Satisfiable  instances) 

1.  Greedy  LS  performs  well  on  dense  instances,  but  not  so  well  on  sparse  ones. 

2.  Random  LS  performs  reasonably  well  on  sparse  instances  but  not  so  well  on 
dense  instances. 

3. Rand Round performs consistently well on all instances, achieving at least
97% optimality for k = 2 and 86% optimality for k = 5. More importantly,
Rand Round outperforms Greedy LS and Random LS for k = 5. It is worth
noting that the average CPU time required by our implementation of Rand
Round is 22.2 seconds for each instance in Experiment 1 and 10.5 seconds
for each instance in Experiment 2.

4. RR LS outperforms all other approaches in all cases, achieving 99% optimality
for k = 2 and 96% optimality for k = 5. It is worth noting that, in all
cases, the standard deviation corresponding to each mean approximation
ratio is smaller than those of Greedy LS and Random LS. This means that our
algorithm gives consistently good approximate solutions for the instances
tested.


Experiments 3 and 4 In the third and fourth sets of experiments, we generate
non-satisfiable instances and measure the absolute ratios. We fix n = 64, k = 2
and n = 20, k = 5 respectively. The same scenario as Experiment 1 is repeated.
Tables 4 and 5 give the outcome of the experiments.

Table 2. Experiment 1.


Table  3.  Experiment  2 


Table 4. Experiment 3

Edge Density (q)   Consistency     Mean absolute ratios
                   Prob (λ)        Greedy LS     Random LS     Rand Round    RR LS
Sparse             0.10            0.455         0.489         0.472         0.503
(2n/m)             0.30            0.752         0.779         0.773         0.825
                   0.50            0.909         0.915         0.946         0.963
                   0.70            0.977         0.928         0.989         0.996
                   0.90            0.984         0.871         1.000(-)      1.000
Medium             0.10            0.367         0.390         0.307         0.395
(n log n/m)        0.30            0.659         0.643         0.623         0.662
                   0.50            0.822         0.848         0.847         0.870
                   0.70            0.956         0.964         0.956         0.977
                   0.90            0.997         0.994         0.999         1.000
Dense              0.10            [illegible]   [illegible]   [illegible]   [illegible]
(n^2/3m)           0.30            0.597         0.605         0.583         0.612
                   0.50            0.787         0.794         0.778         0.809
                   0.70            0.933         0.941         0.925         0.943
                   0.90            0.996         0.945         0.998         1.000

Table 5. Experiment 4


6  Conclusion 

In this paper, we have proposed a new approach for finding good approximate
solutions for the Weighted CSP (W-CSP). This method is based on a recent
breakthrough in randomized algorithms from the theoretical Computer Science
community, and on the much-researched area of semidefinite programming in the
Operations Research community. Our algorithm runs in polynomial time in the
worst case, and it depends heavily on the speed of solving a semidefinite
program. The good news is that semidefinite programs are solvable quickly in
practice, much like linear programs, and much research is going on in the
Operations Research community to develop even faster algorithms based on
interior point methods. Another advantage of our algorithm is that it has a
provable worst-case bound, which ensures that it will never perform
embarrassingly poorly.

Experimentally, our algorithm works well for satisfiable W-CSP instances
drawn at random from a distribution parameterized by the edge probability and
consistency. For non-satisfiable instances, the improvement is less dramatic. It
remains to apply our algorithm to real-world instances.

We have proposed two rounding strategies to round fractional solutions to
valid assignments. This opens up a new research avenue: considering other
rounding strategies which exhibit both a good worst-case bound and good
empirical performance.
Abstract. E-GENET has shown some success in extending GENET to
non-binary CSP’s. However, the generic constraint representation scheme
of E-GENET requires storing too many penalty values in constraint nodes,
and the min-conflicts heuristic is not efficient enough on some problems.
To overcome these two weaknesses and further improve the performance, we
propose several modifications. Together, they boost the efficiency of
E-GENET without resorting to modifying the underlying network model or
the convergence procedure in an ad hoc manner. The performance of the
modified E-GENET also compares well against that of CHIP.


1  Introduction 

Many problems in artificial intelligence and computer science in general can be
formulated as constraint satisfaction problems (CSP’s). Efficient algorithms for
solving CSP’s are thus very useful. In 1992, Minton et al. published a paper on
a new approach for solving CSP’s, known as the heuristic repair method or
iterative repair method [8]. Some standard problems such as N-queens and
graph-coloring can be solved orders of magnitude faster than with traditional
backtracking techniques. The average solution time for the million-queens
problem is reduced to less than one and a half minutes on a SPARCstation 1 [8].

A problem of this approach is that execution can easily be trapped in a local
minimum (or maximum), a state in which no repair can be made but the current
assignment is still inconsistent. When trapping occurs, execution has to be
aborted. This situation is most likely to occur in highly constrained problems [2],
especially non-binary CSP’s, since reassigning one variable at each step usually
cannot reduce the number of constraint violations.

In a previous paper [7], we presented E-GENET, which is based on the iterative
repair approach, features a generic representation scheme for general constraints
and adopts the heuristic learning rule from GENET [2]. Constraints ranging
from disjunctive constraints to non-linear constraints to symbolic constraints
can now be handled. However, being a first step towards solving non-binary
CSP’s, E-GENET has two insufficiencies. For a complicated constraint, there may
be a large number of penalty values stored in the corresponding constraint node.
Besides, the underlying principle of E-GENET, the min-conflicts heuristic in
the iterative repair approach, cannot provide enough information to guide the
search efficiently in some non-binary CSP’s and hence the performance is not
satisfactory.

In this paper, we describe several modifications to E-GENET, such as a new
type of node to deal with the problem of large constraint nodes and a novel
assignment scheme of initial penalty values to improve the effectiveness of the
min-conflicts heuristic on non-binary CSP’s. We have also implemented a
prototype to show the feasibility and efficiency of our proposal. The modified
E-GENET compares well against CHIP [5] in most of the CSP’s tested.

The rest of this paper is organized as follows. Section 2 briefly reviews
E-GENET. In section 3, we explain the inadequacies of E-GENET. Section 4
describes the proposed modifications. Benchmarking results and related work are
presented in sections 5 and 6 respectively. Section 7 summarizes our
contributions and sheds light on future work.

2  Brief  Review  of  E-GENET 

E-GENET extends GENET to handle general constraints, both binary and
non-binary, through a generic constraint representation scheme. It consists of a
network model and a convergence procedure based on the min-conflicts heuristic
of the iterative repair approach and the learning heuristic of GENET.

2.1  Network  Architecture 

E-GENET has two types of nodes: variable nodes and constraint nodes. Each
variable in a CSP is represented by a variable node, which contains the domain
associated with the variable. The state $S_x$ of a variable node $x$ is defined to be
the current variable assignment. A constraint node is created for each constraint
in the CSP. A variable node $x$ is connected to a constraint node $c$ if $x$ occurs in
$c$.^1 Consider the CSP “$x + y + z = 9 \wedge 3y - z = 4$,” where the domains of $x$,
$y$, and $z$ are $\{1, \ldots, 10\}$. Figure 1 shows the CSP’s network in E-GENET. The
constraint node for $3y - z = 4$ is connected to the related variable nodes $y$ and $z$;
and the constraint node for $x + y + z = 9$ is connected to all of $x$, $y$, and $z$. The
current state of the network represents the following variable assignment: $x = 3$,
$y = 2$, and $z = 4$.

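For any constraint $c(x_{i_1}, \ldots, x_{i_k})$, each combination (or tuple)
$(v_1, \ldots, v_k)$ of possible values from the domains of $x_{i_1}, \ldots, x_{i_k}$
is given a penalty value $\delta_c(v_1, \ldots, v_k)$.
These penalty values, (conceptually) stored in the corresponding constraint node,
may be modified as a result of heuristic learning. Initially, penalty values of
prohibited tuples are set to $-1$ and others to 0.

Each value $v$ in the domain of variable $x$ has an input $I_{x=v}$ defined as:

$$I_{x=v} = \sum_{c(x_{i_1}, \ldots, x, \ldots, x_{i_k})} \delta_c(S_{x_{i_1}}, \ldots, v, \ldots, S_{x_{i_k}})$$

where the sum ranges over the constraints connected to $x$, and $v$ takes the
argument position of $x$.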


^1 We relax terminology by naming a variable (a constraint) also by its variable
(constraint) node name.
[Figure: constraint nodes x + y + z = 9 and 3y - z = 4, connected to variable nodes x, y, z]

Fig. 1. An example network generated by E-GENET


A negative input indicates that the value is involved in the violation of some
constraints in the network. When the sum of the inputs of the values in the
current variable assignment is zero, no constraints are violated and the network
is in a solution state. The variable assignment associated with a solution state
is a solution to the corresponding CSP.
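The input computation can be illustrated on the example network above. The following is a minimal sketch, not the authors' implementation: it stores every penalty value explicitly, and the names `CONSTRAINTS`, `initial_penalties` and `node_input` are introduced here for illustration only.

```python
from itertools import product

DOMAIN = range(1, 11)          # domains of x, y and z are {1, ..., 10}

# Each constraint node: (connected variable nodes, predicate on their values).
CONSTRAINTS = [
    (("x", "y", "z"), lambda x, y, z: x + y + z == 9),
    (("y", "z"), lambda y, z: 3 * y - z == 4),
]

def initial_penalties(variables, predicate):
    """Penalty table of a constraint node: 0 if allowed, -1 if prohibited."""
    return {t: 0 if predicate(*t) else -1
            for t in product(DOMAIN, repeat=len(variables))}

PENALTIES = [initial_penalties(vs, pred) for vs, pred in CONSTRAINTS]

def node_input(state, var, value):
    """Input I_{var=value}: sum of penalties over the constraints connected
    to var, with var set to value and every other variable at its state."""
    trial = dict(state, **{var: value})
    total = 0
    for (vs, _), table in zip(CONSTRAINTS, PENALTIES):
        if var in vs:
            total += table[tuple(trial[v] for v in vs)]
    return total

state = {"x": 3, "y": 2, "z": 4}          # the assignment shown in Fig. 1
inputs_now = {v: node_input(state, v, state[v]) for v in state}
print(inputs_now)
```

In the state of Fig. 1 the constraint 3y - z = 4 is violated, so the inputs of the current values of y and z are negative, while an assignment such as x = 5, y = 2, z = 2 makes all inputs zero.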

2.2  Convergence  Procedure 

The dynamics of E-GENET concern how the network changes the states of variable
nodes and the penalty values in constraint nodes before settling in solution state(s).
Initially, a complete but possibly inconsistent variable assignment is generated.
The value of each variable node $x$ is then updated to reduce constraint violation
by choosing the value with maximum input:

$$x := v' \quad \text{if } I_{x=v'} = \max\{I_{x=v} \mid v \in \text{domain of } x\}$$

If several values share the maximum input and the value currently assigned to $x$
is one of them, it stays assigned. Otherwise, a value with maximum input is
picked randomly. Following [2], we define a repair to be a variable update.
Variables in the network are updated asynchronously until the variable assignment
remains unchanged.

E-GENET terminates execution if a solution state is reached. Otherwise, the
network is trapped in a local minimum and heuristic learning is activated to help
escape from it. Heuristic learning in E-GENET amounts to decreasing the penalty
values of tuples violating some constraints. We leave the heuristic learning rule
unspecified intentionally since application domain knowledge can usually assist
us in designing a good learning rule for specific problems.
The convergence procedure is summarized in figure 2.
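A runnable sketch of this procedure is given below. It makes several assumptions that the text leaves open: asynchronous updating is simulated by visiting variables in random order, the learning rule simply decreases the penalty of each currently violated tuple by 1, and penalties are stored lazily. The function `solve` and its parameters are illustrative names, not part of E-GENET.

```python
import random

def solve(domains, constraints, max_cycles=1000, seed=0):
    """domains: {var: list of values}; constraints: list of (vars, predicate).
    Returns a solution dict, or None if max_cycles is exhausted."""
    rng = random.Random(seed)
    tables = [dict() for _ in constraints]   # only learned penalties are stored

    def penalty(i, tup):
        _, pred = constraints[i]
        return tables[i].get(tup, 0 if pred(*tup) else -1)

    def node_input(state, var, value):
        trial = dict(state, **{var: value})
        return sum(penalty(i, tuple(trial[v] for v in vs))
                   for i, (vs, _) in enumerate(constraints) if var in vs)

    state = {v: rng.choice(dom) for v, dom in domains.items()}
    for _ in range(max_cycles):
        changed = True
        while changed:                        # update until no repair is made
            changed = False
            for var in rng.sample(list(domains), len(domains)):
                inputs = {val: node_input(state, var, val)
                          for val in domains[var]}
                best = max(inputs.values())
                if inputs[state[var]] < best:  # current value wins ties
                    state[var] = rng.choice(
                        [v for v, s in inputs.items() if s == best])
                    changed = True
        if all(node_input(state, v, state[v]) == 0 for v in domains):
            return state                      # solution state reached
        # Local minimum: heuristic learning (assumed rule, see lead-in).
        for i, (vs, _) in enumerate(constraints):
            tup = tuple(state[v] for v in vs)
            if penalty(i, tup) < 0:
                tables[i][tup] = penalty(i, tup) - 1
    return None

domains = {v: list(range(1, 11)) for v in "xyz"}
constraints = [
    (("x", "y", "z"), lambda x, y, z: x + y + z == 9),
    (("y", "z"), lambda y, z: 3 * y - z == 4),
]
solution = solve(domains, constraints)
print(solution)
```

The inner loop always terminates: each repair strictly increases the total penalty of the current assignment, which is an integer bounded above by zero.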

3  Inadequacies  of  E-GENET

3.1  Cumbrous  Constraint  Node 

Consider the constraint $13x_1 + 4x_2 + 5x_3 + 9x_4 = 7y_1 + 8y_2$ where all variables have
the domain $\{1, \ldots, 100\}$. Since there are $100^6$ possible tuples, we have to store
$10^{12}$ penalty values in the corresponding constraint node, which is impractical.
repeat
    update all variable nodes asynchronously until no repair
    if (sum of inputs of values in current assignment is zero)
        terminate with success
    else
        activate heuristic learning
until (time limit or maximum number of cycles is exceeded)

Fig. 2. The convergence procedure of E-GENET


If we break down the constraint into $13x_1 + 4x_2 + 5x_3 = T$ and
$T + 9x_4 = 7y_1 + 8y_2$ by using a new variable $T$, the domain of $T$ after pruning would
be $\{22, \ldots, 1491\}$. The number of possible tuples for these two constraints is
then $2 \times 1470 \times 100^3$, i.e. $2.94 \times 10^9$. Although we can successfully lower the
storage requirement, the performance would be affected by the addition of a new
variable with a relatively large domain. Further decomposition of the two constraints
can cut the space requirement to a greater extent but induces a severe drawback
in performance. So this is not a good method to solve the problem. What we
need is something that has properties similar to $T$ but does not increase the
number of variables.
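The arithmetic behind these counts can be checked directly. The sketch below assumes the coefficient of x3 is 5, as in the later statement of the constraint and as implied by the lower bound 22 of the pruned domain of T; the variable names are illustrative.

```python
D = 100                              # every variable ranges over 1..100

# Direct representation: one penalty value per tuple of the six variables.
direct = D ** 6

# Pruned domain of T, where T = 13*x1 + 4*x2 + 5*x3 and T + 9*x4 = 7*y1 + 8*y2.
t_min = 13 + 4 + 5                   # all of x1, x2, x3 at their minimum 1
t_max = (7 + 8) * D - 9 * 1          # largest right-hand side, smallest 9*x4
t_size = t_max - t_min + 1

# Two constraints, each over T and three of the original variables.
decomposed = 2 * t_size * D ** 3

print(direct, (t_min, t_max), t_size, decomposed)
```

The upper bound of T comes from the second constraint rather than the first, since 13*100 + 4*100 + 5*100 = 2200 exceeds 1491.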


3.2  Inefficiency  of  the  min-conflicts  heuristic 

In this section, we extend the model in [8] to investigate the efficiency of
E-GENET. Consider a CSP with $n$ variables $x_1, \ldots, x_n$ where each $x_i$ has $k$ possible
values and is involved in exactly $c$ $m$-ary constraints. Assume that there is only
one solution $(v_1, \ldots, v_n)$ and the randomly generated initial assignment has $d$
variables assigned differently from the solution. Denote the set of these $d$ variables
as $Var$. For any constraint $c'(x_{i_1}, \ldots, x_{i_m})$ in the CSP, if $\exists j$ such that
$S_{x_{i_j}} \neq v_{i_j}$, let the probability of the constraint being violated be $p$.

Randomly choose a variable $x_k \in Var$ for repairing (the case for $x_k \notin Var$
is similar). For any one of the $k - 1$ incorrect values ($\neq v_k$) of $x_k$, since all $c$
related constraints have a probability $p$ of being violated, the expected number
of constraint violations is $pc$ and hence the expected input to these values is $-pc$.

Consider the correct value of $x_k$. For an arbitrary constraint depending on
the variable, the probability that all other $m - 1$ variables are not elements of
$Var$ is $\binom{n-d}{m-1} / \binom{n-1}{m-1}$. The probability of the constraint being violated is thus
$\left(1 - \binom{n-d}{m-1} / \binom{n-1}{m-1}\right) p$, and the expected input to the correct value of $x_k$ is given
by $-\left(1 - \binom{n-d}{m-1} / \binom{n-1}{m-1}\right) pc$.

From the expected inputs, we can see that the probability of making an
incorrect repair would be decreased if $c$ increases or $d$ shrinks. In other words,
if the number of constraints is small or $d$ is very near to $n$ ($d > n - m$) at the
beginning, the efficiency of E-GENET may be very low.
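The two expected inputs can be compared numerically. The sketch below assumes the counting argument that the other m - 1 variables of a constraint are a uniformly random subset of the remaining n - 1 variables, of which n - d are assigned correctly; `expected_inputs` is an illustrative name.

```python
from math import comb

def expected_inputs(n, m, c, d, p):
    """Expected inputs when repairing a variable in Var.

    Returns (input to an incorrect value, input to the correct value)."""
    incorrect = -p * c
    # Probability that the other m-1 variables of a constraint are all correct.
    all_correct = comb(n - d, m - 1) / comb(n - 1, m - 1)
    correct = -(1 - all_correct) * p * c
    return incorrect, correct

# Few wrong variables: the correct value is clearly preferred.
print(expected_inputs(n=10, m=3, c=5, d=2, p=0.5))
# d > n - m: both inputs coincide, so the repair is effectively random.
print(expected_inputs(n=4, m=4, c=4, d=2, p=1))
```

The gap between the two inputs is the fraction `all_correct` times pc, so it vanishes as soon as fewer than m - 1 of the other variables are correct.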

This result can also be observed from the benchmarking results of E-GENET
in our previous paper [7]. We find that in some non-binary CSP’s like systems of
linear equations and cryptarithmetic problems, the performance of E-GENET
is worse than that of CHIP. Use the problem in figure 3 as an example. Here,
$m = n = c = 4$ and $p$ is approximately equal to 1. If we try to repair a variable
that is assigned incorrectly at the start, in almost all cases, the input to any one
of the incorrect values of the variable is $-4$ and that to the correct value is $-4$
as well. Hence, variables are updated randomly and the network would wander
over all possible states aimlessly until sufficient information is provided by
learning.


691x1 +  81x2 +  22x3 + 629x4 =  7007
519x1 + 147x2 - 971x3 - 710x4 = -8726
841x1 - 527x2 + 948x3 - 589x4 = -4357
899x1 + 343x2 - 877x3 + 531x4 =  4571
0 < x1, x2, x3, x4 < 10

Fig. 3. A system of linear equations


4  Modifications 

To  overcome  the  two  weaknesses  of  E-GENET,  we  propose  four  modifications. 
All  of  them  together  can  boost  the  performance  of  E-GENET  without  resorting 
to  modifying  the  underlying  network  model  or  the  convergence  procedure  in  an 
ad  hoc  manner. 

4.1  Intermediate  Node 

The first modification is the introduction of a new type of node called
intermediate nodes to address the problem of cumbrous constraint nodes. Figure 4
shows a general representation for a constraint with this modification.


Fig.  4.  General  representation  for  a  constraint  after  the  addition  of  intermediate  nodes 


Formally, a representation for a constraint on $n$ variables $\{x_1, \ldots, x_n\}$ is a
directed acyclic graph (dag) with at least $n + 1$ nodes. One of the nodes is called
the constraint node, denoted by $c'$. We have $indegree(c') = 0$ and
$outdegree(c') \geq 1$.^2 $n$ of the other nodes are called variable nodes. There exists a
one-one mapping between variables and variable nodes. Thus we relax terminology
and give a variable and its corresponding variable node the same name. For all
$x_i$, $indegree(x_i) \geq 1$ and $outdegree(x_i) = 0$.

^2 indegree(Y) is the number of edges entering node Y and outdegree(Y) is that leaving.

All other nodes, not in $\{c', x_1, \ldots, x_n\}$, are intermediate nodes. Let the set of
intermediate nodes be $\{f_1, \ldots, f_m\}$. Each intermediate node $f_i$ has the properties
that $indegree(f_i) \geq 1$ and $outdegree(f_i) \geq 1$. Define $outbundle(f_i)$ as the set
$\{a_1, \ldots, a_p\}$ such that for each $a_j$ in the set, there is an edge from $f_i$ to $a_j$.
Intermediate node $f_i$ is associated with a function $F_i(a_1, \ldots, a_p)$ defined on its
outbundle. The state $S_{f_i}$ of $f_i$ is the value $F_i(S_{a_1}, \ldots, S_{a_p})$ and $f_i$’s domain,
$dom(f_i)$, is the range of the function $F_i$. In constraint node $c'$, we store penalty
values for combinations of values from domains of nodes in $outbundle(c')$ instead.

Consider the constraint $x^2 + x^2 y = 4$. Figure 5 shows a possible representation
for this constraint. The intermediate node $a$ is associated with the function $x^2$
and currently has the value (state) 4. If $x \in \{1, 2\}$ and $y \in \{3, 4\}$, after the
addition of nodes $a$ and $b$, the content of the constraint node is changed as
shown in the figure.


Constraint node: a + b = 4 (x^2 + x^2 y = 4)

a     b     δ_{a+b=4}
1     3      0
1     4     -1
4     12    -1
4     16    -1

Intermediate nodes: a, b. Variable nodes: x, y.

Fig. 5. An example representation with intermediate nodes


Let the outbundle for constraint node $c'$ of the constraint $c(x_1, \ldots, x_n)$ be
$\{f_1, \ldots, f_p\}$. Since for each combination $(v_1, \ldots, v_n)$ of possible values from
domains of $x_1, \ldots, x_n$, there is a corresponding tuple $(S_{f_1}, \ldots, S_{f_p})$, a mapping
$H : x_1 \times \ldots \times x_n \rightarrow B$ where $B \subseteq dom(f_1) \times \ldots \times dom(f_p)$ exists and the range
of $H = B$ (surjective). Each tuple in $B$ is given a penalty value and all these
penalty values form the content of $c'$.

There is one restriction on the usage of intermediate nodes. For each $(v_1, \ldots, v_p)$
in $B$, all tuples in $H^{-1}(v_1, \ldots, v_p)$ must either satisfy or violate the constraint
unanimously. If all these tuples satisfy the constraint, the initial penalty
value for $(v_1, \ldots, v_p)$ is 0. Otherwise, it is set to $-1$. The heuristic learning is
similar to that in the original E-GENET. For example, a plausible learning rule
is $\delta_{c'}(S_{f_1}, \ldots, S_{f_p}) := \delta_{c'}(S_{f_1}, \ldots, S_{f_p}) - (\delta_{c'}(S_{f_1}, \ldots, S_{f_p}) < 0)$.^3
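The grouping induced by H, together with the unanimity restriction, can be sketched as follows for the constraint x^2 + x^2 y = 4 with x in {1, 2} and y in {3, 4}. This is an illustration under the assumption that each intermediate function is given directly in terms of the variables; `build_constraint_node` is a hypothetical helper, not part of E-GENET.

```python
from itertools import product

def build_constraint_node(domains, functions, predicate):
    """Group variable tuples by their image under H and assign initial
    penalties: 0 for satisfying groups, -1 for violating groups.

    domains: one list of values per variable; functions: F_1..F_p, each
    applied to the variable tuple; predicate: the original constraint."""
    groups = {}
    for tup in product(*domains):
        key = tuple(f(*tup) for f in functions)       # H(tup), a tuple in B
        groups.setdefault(key, []).append(tup)
    content = {}
    for key, members in groups.items():
        verdicts = {bool(predicate(*t)) for t in members}
        if len(verdicts) != 1:                        # restriction violated
            raise ValueError(f"group {key} is not unanimous")
        content[key] = 0 if verdicts.pop() else -1
    return content

content = build_constraint_node(
    domains=[[1, 2], [3, 4]],
    functions=[lambda x, y: x ** 2, lambda x, y: x ** 2 * y],   # nodes a, b
    predicate=lambda x, y: x ** 2 + x ** 2 * y == 4,
)
print(content)
```

The resulting table has one entry per pair (a, b), which is the content of the constraint node in Fig. 5. A grouping that merged satisfying and violating tuples would be rejected by the unanimity check.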

The most essential usage of intermediate nodes is to reduce the number of
nodes in the outbundle of the constraint node and the size of the domains of these
nodes, i.e. the size of $B$. Although we allow any levels of intermediate nodes, one
is sufficient to meet this objective.

^3 < is a boolean function returning 1 if the comparison is true and 0 otherwise.

Theorem Given a representation $M$ for a constraint $c(x_1, \ldots, x_n)$, there exists
another representation $M'$ such that all paths in $M'$ are of length $\leq 2$ and the
content of the constraint node in $M'$ is the same as that in $M$.

The  philosophy  behind  intermediate  nodes  is  to  divide  tuples  of  a  constraint 
into  several  groups  (using  the  function  H).  Each  group  (a  tuple  in  B)  is  given 
only  one  penalty  value  and  during  learning,  we  would  treat  a  group  as  a  unit 
and  penalize  a  whole  group. 

Consider again the constraint $13x_1 + 4x_2 + 5x_3 + 9x_4 = 7y_1 + 8y_2$. It can
be represented as shown in figure 6. Intermediate nodes $a$ and $b$ represent the
expressions on the two sides of the constraint respectively. Each distinct combination
$(v_a, v_b)$ calculated with possible values from domains of $x_1, x_2, x_3, x_4, y_1, y_2$ is
given one penalty value. When heuristic learning is activated, if $S_a \neq S_b$, we
would decrease the penalty value $\delta_{(a=b)}(S_a, S_b)$ by 1. The pair (60, 50) in the
new constraint node, for example, represents the set of tuples {(1, 1, 5, 2, 6, 1),
(1, 2, 6, 1, 6, 1), (1, 6, 1, 2, 6, 1), (1, 7, 2, 1, 6, 1), (2, 5, 1, 1, 6, 1)} in the original node.
Updating the penalty value of the pair is equivalent to performing the same
operation on penalty values of all tuples in the set.
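The group of original tuples represented by the pair (60, 50) can be enumerated with a short search. The sketch below is illustrative only; `preimages` is a hypothetical helper that bounds each variable by the target divided by its coefficient, which is valid because all coefficients and values are positive.

```python
from itertools import product

LEFT = (13, 4, 5, 9)      # coefficients of x1..x4 (intermediate node a)
RIGHT = (7, 8)            # coefficients of y1, y2 (intermediate node b)

def preimages(coeffs, target, lo=1, hi=100):
    """All value tuples in [lo, hi] whose weighted sum equals target."""
    # A variable with coefficient k cannot exceed target // k.
    ranges = [range(lo, min(hi, target // k) + 1) for k in coeffs]
    return [t for t in product(*ranges)
            if sum(k * v for k, v in zip(coeffs, t)) == target]

# Tuples (x1, x2, x3, x4, y1, y2) mapped to the pair (a, b) = (60, 50).
group = [xs + ys for xs in preimages(LEFT, 60) for ys in preimages(RIGHT, 50)]
for tup in group:
    print(tup)
```

Decreasing the single penalty value stored for (60, 50) therefore penalizes all of these tuples at once.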


[Figure: intermediate nodes a and b above variable nodes x1, x2, x3, x4, y1, y2]

Fig. 6. A possible representation for the constraint 13x1 + 4x2 + 5x3 + 9x4 = 7y1 + 8y2


Under this representation, the storage requirement for the constraint becomes
$4 \times 10^6$ (compared to the original $10^{12}$). If we divide all possible tuples
into fewer groups, we can save more space. However, performance would be
affected. Decreasing the penalty value of a tuple during learning reduces the
probability that this tuple is assigned to the corresponding variables again. With
grouping, changing one penalty value influences the probabilities of all tuples
in the group, and this effect may not be desirable. To avoid any decrease in
performance, we carefully design a suitable grouping for each type of constraint.
Moreover, in practice, only penalty values of tuples that have been penalized
during learning are stored. The others are computed on demand.

Besides reducing the size of the constraint node, intermediate nodes have two
more advantages:
Eliminating Common Subexpression: In each constraint node, if we do not store
all penalty values directly and, instead, derive some storage scheme to reduce
the storage requirement, we usually need to do constraint checks. Intermediate nodes
can be used to store the results of common subexpressions and eliminate the need
for re-calculation. The node a in figure 7 removes redundant work on both inter-
and intra-constraint common subexpressions.


Constraint nodes: a + y = z (x^2 + y = z) and a + b = 4 (x^2 + x^2 y = 4)

Fig. 7. An example network with the common subexpression


Unifying Heuristic Learning: The heuristic learning rule is not fixed in the original
E-GENET. During learning, for any constraint $c(x_1, \ldots, x_n)$, if $\delta_c(S_{x_1}, \ldots, S_{x_n}) < 0$,
users can choose a $P$ where $\{(S_{x_1}, \ldots, S_{x_n})\} \subseteq P \subseteq \{(v_1, \ldots, v_n) \mid \delta_c(v_1, \ldots, v_n) < 0\}$
and decrease penalty values of all tuples in $P$ by 1. As most advantages of this
practice can be obtained by using intermediate nodes, heuristic learning rules
can be unified with $P = \{(S_{x_1}, \ldots, S_{x_n})\}$.

Use the constraint atmost(1, $\{x_1, x_2, x_3, x_4\}$, {3}) as an example. This
constraint states that at most one out of the four variables can take the value 3.
We usually choose $P = \{(v_1, \ldots, v_4) \mid \delta_{atmost}(v_1, \ldots, v_4) < 0\}$. Constructing the
representation as in figure 8, we can use the learning rule
$\delta_{(a=0)}(S_a) := \delta_{(a=0)}(S_a) - (\delta_{(a=0)}(S_a) < 0)$ instead to get the same effect.


Constraint node: a = 0 (atmost(1, {x1, x2, x3, x4}, {3}))

Fig. 8. The representation for the atmost constraint


4.2  New  Assignment  Scheme  of  Initial  Penalty  Values

E-GENET is based on the min-conflicts heuristic, which favors the value
violating the minimum number of constraints. Before the learning procedure is
activated, the input to a value is $-n$ where $n$ is the number of constraints violated
by assigning this value to the corresponding variable. E-GENET would then change
the value of the variable to the one that has the maximum input.

If the network is trapped in a local minimum, heuristic learning is invoked.
This amounts to decreasing the penalty values of tuples violating some
constraints. Assume that in a particular learning process, the tuple $(v_1, \ldots, v_n)$
of a constraint $c(x_1, \ldots, x_n)$ is penalized. This can be considered as adding
a redundant constraint illegal$((x_1, \ldots, x_n), (v_1, \ldots, v_n))$, which specifies that
$(x_1, \ldots, x_n) \neq (v_1, \ldots, v_n)$: decreasing the penalty value is effectively
the same as combining the content of the constraint node with that of the
illegal constraint.


new δ_c(x1,...,xn) = old δ_c(x1,...,xn) + δ_illegal((x1,...,xn),(v1,...,vn))

(x1, ..., xn)     new δ_c     old δ_c     δ_illegal
...               ...         ...          0
(v1, ..., vn)     -2          -1          -1
...               ...         ...          0
Hence decreasing penalty values can be interpreted as the accumulation of
knowledge by introducing redundant constraints. It is well known that domain
knowledge or some other information can be utilized to guide the search by
adding appropriate redundant constraints. According to the above discussion,
we need not create new constraint nodes for these redundant constraints
in the E-GENET model. What we need is simply a new assignment scheme of
initial penalty values: for any constraint, initial penalty values of tuples
satisfying the constraint are 0, but those for prohibited tuples may be any negative
integers determined by the user from domain knowledge.^4 This resembles
the fitness in the “Evolutionary Model” of GENET [11].


4.3  Concept  of  Contribution 


Consider a constraint $c(x_1, \ldots, x_n)$. If the current assignments of $x_1, \ldots, x_n$ are
$v_1, \ldots, v_n$ respectively, $\delta_c(v_1, \ldots, v_n)$ is always used in computing any input
$I_{x_i=v_i}$, where $1 \leq i \leq n$. However, this method is inadequate in some cases. Take the
constraint atmost(1, $\{x_1, x_2, x_3, x_4\}$, {3}) and the tuple (3, 4, 3, 3) as an example.
That $x_2$ is assigned the value 4 does not contribute to the violation of the
constraint. The penalty value of the tuple should therefore not be added to the
input $I_{x_2=4}$, as doing so would decrease the probability that $x_2$ takes the value 4.
Thus, the definition of an input is given as:

$$I_{x=v} = \sum_{c(x_{i_1}, \ldots, x, \ldots, x_{i_k})} contribute_c\bigl(j, (S_{x_{i_1}}, \ldots, v, \ldots, S_{x_{i_k}})\bigr) \times \delta_c(S_{x_{i_1}}, \ldots, v, \ldots, S_{x_{i_k}})$$

where the $j$-th argument of $(S_{x_{i_1}}, \ldots, v, \ldots, S_{x_{i_k}})$ is $v$; $contribute_c(i, (v_1, \ldots, v_n))$
is a function returning a value between 0 and 1. The magnitude of the value shows
the contribution of $v_i$ to the tuple $(v_1, \ldots, v_n)$ with respect to the constraint $c$.

^4 It should be noted that any finite domain constraint can be expressed as a
conjunction of illegal constraints.
4.4 Learning Heuristic

As the initial penalty value of a tuple can now be much smaller than $-1$, the
landscape of the search space becomes comparatively rough. A penalty amount
of $-1$ may not be sufficient to destabilize the network. So if the same tuple of a
constraint is penalized consecutively, the penalty amount is increased exponentially
at each time of learning ($-1, -1, -2, -4, -8, \ldots$).
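One way to realize this schedule is sketched below. It takes the printed sequence -1, -1, -2, -4, -8 literally and assumes that "consecutively" means the same tuple of the same constraint was penalized in the immediately preceding learning step; the `Learner` class and its bookkeeping are illustrative, not taken from E-GENET.

```python
def penalty_amount(consecutive):
    """Amount added on the `consecutive`-th penalization in a row (1-based),
    following the sequence -1, -1, -2, -4, -8, ..."""
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    return -1 if consecutive == 1 else -(2 ** (consecutive - 2))

class Learner:
    """Remembers, per constraint, the last penalized tuple and its streak."""

    def __init__(self):
        self.last = {}                 # constraint id -> (tuple, streak)

    def penalize(self, penalties, cid, tup, initial=-1):
        """Decrease the penalty of `tup` in constraint `cid`; `initial` is
        the tuple's initial penalty if it has not been stored yet."""
        prev, streak = self.last.get(cid, (None, 0))
        streak = streak + 1 if prev == tup else 1
        self.last[cid] = (tup, streak)
        key = (cid, tup)
        penalties[key] = penalties.get(key, initial) + penalty_amount(streak)
        return penalties[key]

penalties = {}
learner = Learner()
history = [learner.penalize(penalties, "c1", (3, 4, 3, 3)) for _ in range(4)]
print(history)
```

Penalizing a different tuple of the same constraint resets the streak, so the amount falls back to -1.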

5  Benchmarking  Results 

To illustrate the feasibility and effectiveness of the modifications, we have
implemented several types of constraints and tested each of them on different
problems. We compare our results with those of the constraint logic programming
language Cosytec CHIP version 4.1.0 [9], which uses traditional constraint
propagation and backtracking tree search for constraint solving.

All benchmarking is performed on a SUN SPARCstation 10 model 30. Timing
(including network construction) and number-of-repairs results for both versions
of E-GENET are medians of 10 runs. For each problem, the median number of penalty
values stored in the modified E-GENET is also shown. A symbol “-” means that
the execution fails due to either the time limit being exceeded or memory
exhaustion.

5.1  Linear  Arithmetic  Constraint 

A linear arithmetic constraint is a constraint of the form $U \,\Delta\, V$, where $U$ and $V$
are linear arithmetic expressions and $\Delta \in \{=, \neq, <, \leq, >, \geq\}$. The representation
of the constraint is shown in figure 9. Intermediate nodes $a$ and $b$ are used to
hold the current values of $U$ and $V$. The initial penalty values for violating tuples
are mainly based on the difference between the values of $a$ and $b$.


Fig.  9.  The  representation  for  a  linear  arithmetic  constraint 


Five traditional benchmark programs [3] have been used to show the ability
of the modified E-GENET in solving linear equation problems. send, donald
and crypta are cryptarithmetic problems of different sizes. eq10 and eq20 are
systems of 10 and 20 linear equations respectively.

Since constraint propagation alone can solve the cryptarithmetic problems
with little backtracking, CHIP outperforms both versions of E-GENET
significantly in most cases. However, we can observe that the proposed modifications
can improve the performance of E-GENET. The modified E-GENET is more
efficient than CHIP in the problems eq10 and eq20, which cannot be solved
within 10 minutes by the original version.

Problem    CHIP              Original E-GENET   Modified E-GENET   penalty values
           CPU Time (sec)    CPU Time (sec)     CPU Time (sec)     stored

send       0.010             0.085 *            ?                  53
donald     0.050             2.895              0.380              166
crypta     0.050             3.400              1.040              629
eq10       0.140             -                  0.045              0
eq20       0.210             -                  0.040              0

* Only one E-GENET timing for send survives in the source; the column it
belongs to is uncertain.

Table 1. Results on cryptarithmetic problems and systems of linear equations


5.2 Atmost Constraint

The atmost($N$, $Var$, $Val$) constraint specifies that no more than $N$ variables
taken from the variable set $Var$ are assigned values in the value set $Val$. Assume
$n$ is the number of variables currently having values in $Val$. If $n > N$, we
would prefer a smaller $n$. Take the constraint atmost(1, $\{x_1, x_2, x_3, x_4\}$, {3}) as
an example. The assignment (1, 3, 3, 2) for the variable tuple $(x_1, x_2, x_3, x_4)$ is better
than (3, 4, 3, 3) as we only need to change the value of one variable to get the
constraint satisfied.

To utilize this information, we use the representation in figure 8 with the
intermediate node $a$ changed to store the number of variables currently
assigned values in $Val$. The initial penalty value for each violating tuple is given
by the formula $N - a$ as follows:


a          δ_atmost
0          0
...        ...
N          0
N + 1      -1
...        ...
|Var|      N - |Var|

where $|Var|$ is the cardinality of $Var$. This can be regarded as adding the set of
redundant constraints {atmost($N + 1$, $Var$, $Val$), atmost($N + 2$, $Var$, $Val$), ...,
atmost($|Var| - 1$, $Var$, $Val$)}. Here we set $contribute_{atmost}(i, (v_1, \ldots, v_n)) =
(v_i \in Val)$, where $\in$ is a function returning 1 if value $v_i$ is in $Val$ and 0 otherwise.
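The penalty scheme and the contribution function for atmost can be sketched as follows. The function names are illustrative, positions are 0-based here, and `atmost_input` only computes the share of a single atmost constraint in an input, not a full network input.

```python
def atmost_penalty(n_limit, values, val_set):
    """Initial penalty: 0 if at most n_limit values lie in Val, else N - a,
    where a is the number of values in Val."""
    a = sum(v in val_set for v in values)
    return 0 if a <= n_limit else n_limit - a

def contribute_atmost(i, values, val_set):
    """1 if the i-th value (0-based) lies in Val and 0 otherwise."""
    return int(values[i] in val_set)

def atmost_input(n_limit, values, val_set, i, candidate):
    """Share of this constraint in the input of assigning `candidate` to
    position i: contribution times penalty of the resulting tuple."""
    trial = values[:i] + (candidate,) + values[i + 1:]
    return (contribute_atmost(i, trial, val_set)
            * atmost_penalty(n_limit, trial, val_set))

VAL = {3}
print(atmost_penalty(1, (1, 3, 3, 2), VAL))        # one change needed
print(atmost_penalty(1, (3, 4, 3, 3), VAL))        # two changes needed
print(atmost_input(1, (3, 4, 3, 3), VAL, 1, 4))    # x2 = 4 is not blamed
print(atmost_input(1, (3, 4, 3, 3), VAL, 0, 3))    # x1 = 3 is blamed
```

With this weighting, only the variables currently holding a value in Val are pushed to change, while the graded penalty prefers tuples that are closer to satisfying the constraint.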

We compare our implementations with CHIP on the car-sequencing problem,
which involves scheduling cars onto an assembly line so that different options
can be installed on these cars while satisfying various utilization constraints [4].
50 problems^5 are tested, 10 for each utilization percentage in the range 60% to
80%. Using the method described in [4], CHIP can only manage to solve 6 out of
50 problems within one hour. We have also tested the performance of CHIP with
the new method in [1]. There is not much improvement. For the two versions of
E-GENET, the execution limit is set to 1000 repairs and the results are summarized
in table 2. The modified E-GENET can terminate in less than 10 minutes for all
500 runs, requiring no more than 1800 repairs (except in one run). In general,
there are increases of 15% to 20% in percentages of successful runs.

^5 We thank Andrew Davenport for supplying the car-sequencing benchmarks.


utilization   Original E-GENET                 Modified E-GENET                 penalty values
%             % succ. runs   median repair     % succ. runs   median repair     stored
60            74             223.5             100            282.5             29
65            80             223.5             99             262               20
70            81             241               100            280.5             30
75            84             339               97             331               74
80            53             576               73             537               187

Table 2. Results on car sequencing problems


5.3  Disjunctive  Constraint 

To handle a disjunctive constraint $C_1 \vee C_2 \vee \ldots \vee C_n$ in the modified E-GENET,
we can use the representation in figure 10. For each constraint $C_i$, there is a
corresponding intermediate node $a_i$ which holds the initial penalty value for the
associated tuple under the assignment scheme of $C_i$. In effect, $a_i$ can be regarded
as the degree of violation of $C_i$. Since we only need one of them satisfied, we
add one more intermediate node b to store the maximum value among all $a_i$'s.
The assignment scheme of the disjunctive constraint is then based on
the value of b. Take the constraint $x = y + z \vee x = 2y - z$ as an example.

The representation is given in figure 11. It is also constructed with the techniques
described in section 5.1.
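
As a rough sketch of this scheme, the node b can be computed as the maximum of the per-disjunct penalties. Section 5.1 is not reproduced here, so the violation measure for an equality (the negated absolute difference of both sides) is our assumption, and all function names are hypothetical.

```python
def equality_penalty(lhs, rhs):
    """Assumed violation measure: 0 when lhs == rhs, more negative otherwise."""
    return -abs(lhs - rhs)


def disjunction_penalty(penalties):
    """Node b: the maximum (least negative) penalty among the disjuncts."""
    return max(penalties)


def example_penalty(x, y, z):
    """Penalty of x = y + z OR x = 2y - z for one assignment."""
    a1 = equality_penalty(x, y + z)      # degree of violation of the first disjunct
    a2 = equality_penalty(x, 2 * y - z)  # degree of violation of the second disjunct
    return disjunction_penalty([a1, a2])
```

Because penalties are non-positive, b equals 0 exactly when at least one disjunct is satisfied, which matches the condition b = 0 attached to the disjunctive constraint.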


[Figure: intermediate node b with the condition $b = 0$ for $C_1 \vee C_2 \vee \ldots \vee C_n$]

Fig. 10. The representation for a disjunctive constraint


The Hamiltonian path problem is used to test this method of handling
disjunctive constraints. The problem can be summarized as follows: given a graph
of n vertices, we have to find an ordering $(v_1, v_2, \ldots, v_n)$ of these n vertices so
that for all i, $1 \le i < n$, there is an edge between $v_i$ and $v_{i+1}$. We formulate the
problem as in [7]. In CHIP and the original E-GENET, one kind of redundant
constraint is used to speed up the execution. We intentionally omit these
constraints in the modified E-GENET to test the efficiency of our proposal.

Fig. 11. The representation for the constraint $x = y + z \vee x = 2y - z$

The benchmarking results are listed in table 3. CHIP uses choices [10] to
model disjunctive constraints. This approach induces a large amount of
backtracking and fails to solve a graph with 50 vertices in 10 hours. With more
information provided by penalty values, the modified E-GENET is several times
faster than the previous version.


graph    CHIP               Original E-GENET    Modified E-GENET    penalty values
nodes    CPU Time (sec)     CPU Time (sec)      CPU Time (sec)      stored
30       581.640            54.000              6.550               215
40       4023.500           587.158             29.465              350
50       -                  4897.092            783.790             2943

Table 3. Results on Hamiltonian path problems


5.4  Cumulative  Constraint 

The cumulative constraint [1] in CHIP is found to be useful, as some real-world
problems like scheduling and placement problems can be stated more easily
and directly. Thus, we implement it in the modified E-GENET. The constraint has
the form $\mathit{cumulative}([O_1, \ldots, O_m], [D_1, \ldots, D_m], [R_1, \ldots, R_m], L)$ and can be
explained with a simple scheduling problem of m tasks and one kind of resource.
We can treat $O_1, \ldots, O_m$ as starting times for the tasks. Each task i uses $R_i$
amount of resources and lasts for $D_i$ units of time. The constraint holds if at
any time unit t, the resources required do not exceed L:
$\sum_{i \,\mid\, O_i \le t \le O_i + D_i - 1} R_i \le L$.

To handle the constraint, we can break it down into several inequality
constraints and apply the techniques described in section 5.1. Let $\alpha(V)$ be the
minimum value in the domain of variable V and $\beta(V)$ the maximum. Let
$t' = \min\{\alpha(O_1), \ldots, \alpha(O_m)\}$ and $t'' = \max\{\beta(O_1) + \beta(D_1), \ldots, \beta(O_m) + \beta(D_m)\}$.
For each time unit t between $t'$ and $t''$, we use an intermediate node to hold the
resources required at that unit. In figure 12, the intermediate node $a_1$ is for $t'$,
$a_2$ for $t' + 1$, and so on. Then we need to solve n constraints of the form $a_i \le L$,
where $n = t'' - t' + 1$.
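
The decomposition can be illustrated on a complete assignment of start times. The sketch below is a simplification under our own assumptions: it works on fixed values rather than on variable domains (so it spans only the time units actually occupied), and `resource_profile` and `cumulative_holds` are hypothetical names.

```python
def resource_profile(starts, durations, resources):
    """Return (t_first, usage), where usage[k] is the demand at time t_first + k.

    Each entry of usage plays the role of one intermediate node a_i.
    Task i occupies the time units starts[i] .. starts[i] + durations[i] - 1.
    """
    t_first = min(starts)
    t_last = max(o + d for o, d in zip(starts, durations)) - 1
    usage = [0] * (t_last - t_first + 1)
    for o, d, r in zip(starts, durations, resources):
        for t in range(o, o + d):
            usage[t - t_first] += r
    return t_first, usage


def cumulative_holds(starts, durations, resources, limit):
    """Check every per-time-unit constraint a_i <= L."""
    _, usage = resource_profile(starts, durations, resources)
    return all(a <= limit for a in usage)
```

Each element of `usage` corresponds to one inequality constraint, so a violated time unit can be penalized on its own rather than through a single monolithic constraint.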

[Figure: intermediate nodes $a_1, a_2, \ldots, a_n$, each bounded by L]

Fig. 12. The representation for the cumulative constraint

Simple scheduling problems with different numbers of tasks and resource
utilization percentages (three problems for each setting) are used as test examples,
and the results are shown in table 4. CHIP cannot solve two problems with 21
tasks within 1 hour. The performance of the original E-GENET is extremely
poor, as there is only one constraint in each problem. With the modifications,
however, the new version of E-GENET can solve all problems efficiently.


task   utilization   CHIP                        Modified E-GENET         penalty values
no.    %             CPU Time (sec)              CPU Time (sec)           stored
7      80            0.020, ?, 0.010             0.010, 0.000, 0.000      0, 0, 0
7      85            0.025, 0.010, 0.040         0.000, ?, ?              0, 0, 0
7      90            0.015, ?, 0.095             0.010, 0.000, ?          49, ?, 0
7      95            0.015, ?, ?                 0.050, 0.010, 0.010      ?, 5, ?
14     80            0.205, 0.015, 0.090         0.095, 0.020, ?          3, ?, 1
14     85            0.030, ?, ?                 0.030, ?, ?              12, ?, 2
14     90            4.630, ?, ?                 0.030, ?, ?              ?, ?, 2
14     95            43.920, 11.185, 13.950      0.125, ?, ?              ?, 25, ?
21     80            0.030, ?, 0.035             0.115, 0.140, 0.045      1, ?, 0
21     85            0.055, 0.045, ?             0.435, 0.265, 0.080      13, 4, 2
21     90            1269.560, ?, 2.420          0.915, 0.135, ?          ?, 9, ?
21     95            -, 69.460, 16.780           0.380, ?, ?              ?, 23, ?

(? marks an entry that is illegible or missing in the source; in cells with
missing entries, the position of the surviving values is uncertain.)

Table 4. Results on simple scheduling problems


6  Related  Work 

Davenport et al. [2] proposed an extension to GENET for non-binary constraints,
in which a hyper-edge is used to link up incompatible labels in an n-ary
constraint. Our approach differs from theirs in three aspects:

1. In E-GENET, all constraints, binary or non-binary, have the same status
and are represented homogeneously, and they are much smaller in size
than their counterparts in the scheme of Davenport et al. For example, for
the constraint $x + y = u + v$ where all variables have the domain {1, ..., 100},
their scheme needs around $100^4$ constraint nodes, whereas only 3 nodes are
needed in E-GENET and penalty values can be computed on demand.

2. The cost of network construction in E-GENET is low compared with that
for the extension of Davenport et al., as a huge amount of work on building
connections is avoided. For the above example, $4 \times 100^4$ connections are
required in their scheme, but we need only 6 in E-GENET.

3. E-GENET also provides much greater flexibility. New types of constraints
and domain knowledge can be incorporated easily by using intermediate nodes
and suitable assignment schemes of initial penalty values, which is not the
case for their scheme. As a result, better performance can be obtained.

7  Conclusion 

In this paper, several modifications to E-GENET are proposed. The new
version of E-GENET has three advantages over the original one. First, the memory
requirement is greatly reduced. Second, there is a great increase in performance,
especially on highly constrained problems. Third, some very complex constraints
can now be handled. Both versions of E-GENET outperform CHIP on CSPs
that require much backtracking in tree search. However, more experiments have to be
done to compare these two approaches in different regions of the problem space.

Interesting future work includes applying the cumulative and the diffn
constraints [1] of the modified E-GENET to real-world problems, and further
enhancing E-GENET to solve partial constraint satisfaction problems [6]
and optimization problems.

Abstract. We propose in this paper a novel way of looking at local
search algorithms for combinatorial optimization problems which better
suits constraint programming by performing branch-and-bound search at
their core. We concentrate on neighborhood exploration and show how
the framework described yields a more efficient local search and opens
the door to more elaborate neighborhoods. Numerical results are given in
the context of the traveling salesman problem with time windows. This
work on neighborhood exploration is part of ongoing research to develop
constraint programming tabu search algorithms applied to routing
problems.


Introduction 

Local search methods in operations research (OR) date back over thirty
years ([Lin65]). Applied to difficult combinatorial optimization problems,
this  heuristic  approach  yields  high-quality  solutions  by  iteratively  considering 
small  modifications  (called  local  moves)  of  a  good  solution  in  the  hope  of  finding 
a  better  one.  Used  within  a  strategy  designed  to  escape  local  optima  such  as 
simulated  annealing  and  tabu  search,  it  has  been  very  successful  in  achieving 
near-optimal  (and  sometimes  optimal)  solutions  to  a  variety  of  hard  problems 
([GLTdW93][Ree93]). 

In solving real-life combinatorial optimization problems, constraint
programming (CP) has to date almost invariably adopted the branch-and-bound
strategy, a global and therefore complete search method.* When an exact algorithm
proved too costly, approximate algorithms were often devised by heuristically
discarding the least-promising edges in the branch-and-bound search tree. The
lack of popularity of local search in constraint logic programming may be
attributed to the apparent need to modify solution variables when performing a
local move.

* Though incomplete search methods have been successfully introduced in the
constraint satisfaction community.

A few exceptions are nevertheless found in the CP literature. In the context of
a  locomotive  scheduling  problem  reduced  to  a  traveling  salesman  problem  with 
deadlines,  [Pug92]  opts  for  an  iterative  improvement  method  which  repeatedly 
looks  for  a  local  move  improving  on  the  current  best  solution.  Local  moves  are 
defined  here  as  a  particular  replacement  of  three  edges  in  the  solution.  Every 
possible  move  is  performed  and  then  checked  against  the  constraints.  [CSKA94] 
use the vehicle routing problem to compare a standard branch-and-bound
approach to iterative improvement with local moves defined as the exchange of two
vertices  in  the  route.  Again  the  constraints  ensure  that  only  feasible  moves  are 
considered.  [CL95]  describe  a  disjunctive  scheduling  system  of  which  local  search 
is  a  component.  Two  types  of  local  moves  are  considered:  “repair”  moves  swap 
two  tasks  scheduled  on  the  same  machine  while  “shuffle”  moves  only  keep  part 
of  the  solution  and  search  through  the  rest  of  the  solution  space  to  complete  it, 
guided  by  constraint  propagation.  The  latter  can  be  seen  as  branch-and-bound 
starting  from  some  internal  node  of  a  search  tree  for  the  original  problem.  A 
limited  number  of  backtracks  is  allowed  since  the  purpose  of  local  search  in 
this  context  is  to  quickly  improve  an  initial  solution  before  resorting  to  (full) 
branch-and-bound  search. 

All  of  the  above,  with  the  exception  of  “shuffle”  moves,  essentially  consider 
each  of  the  possible  moves  individually  to  then  assess  their  feasibility  and  cost. 
This  can  be  a  costly  endeavor  in  both  OR  and  CP  when  the  nature  of  the  local 
moves  is  such  that  a  great  number  of  possibilities  must  be  considered.  In  a  way, 
“shuffle”  moves  bear  a  resemblance  to  what  we  have  in  mind.  We  believe  that 
local  search  is  not  so  foreign  to  branch-and-bound  search  and  that  constraints 
can  be  more  actively  involved  in  the  exploration  of  these  local  search  spaces.  We 
propose in this paper a clean integration of local search in constraint
programming by keeping on doing what comes naturally, i.e. branch-and-bound, but on
a different search space, though related to the original one. In contrast with OR,
the resulting framework maintains a clear separation between the constraints
of the problem and the actual search procedure. For both OR and CP, the
potential pruning capabilities open the door to more elaborate local moves, which
could  lead  to  even  better  approximate  results.  In  addition,  it  does  not  require 
modifying  the  value  of  (logic)  variables. 

The rest of the paper is organized as follows. Section 1 first gives an overview
of local search methods in OR. Our general framework for local search in CP is
then presented in section 2, followed by an instance of it in section 3. Finally,
an experimental evaluation in section 4 provides insight into the potential gain
of such an approach.


1  Local  Search  Methods  in  Operations  Research 

Local  search  methods  generally  involve  repeatedly  going  from  one  solution  to 
another  through  a  local  move.  What  constitutes  a  valid  local  move  will  vary 
according  to  the  problem  and  even  within  it,  as  we  shall  see  in  section  3.1.  The 
set  of  all  solutions  reachable  from  a  solution  s  through  a  local  move  is  termed 
the neighborhood of s. The set of all feasible solutions in this neighborhood will
be  called  its  feasible  neighborhood. 

Given this framework, a simple strategy called iterative improvement moves
to the best feasible neighbor (i.e. of lowest cost) every time until it does not
improve on the current solution, reaching a local optimum. Some ways of
alleviating the obvious drawback of this strategy have been proposed. Multi-start
iterative improvement achieves local optima from a pool of solutions and returns
the best one. Genetic local search builds upon the previous by recombining the
local optima (in the fashion of genetic algorithms [Hol75]), applying iterative
improvement, discarding the least-promising solutions and repeating the process
until some stopping criterion is satisfied.
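
Iterative improvement as described above can be sketched generically. The function below is a minimal illustration, not a specific published implementation; `neighbors`, `cost` and `feasible` are hypothetical callbacks supplied by the caller.

```python
def iterative_improvement(start, neighbors, cost, feasible=lambda s: True):
    """Move to the best feasible neighbor until no neighbor improves the cost."""
    current = start
    while True:
        candidates = [s for s in neighbors(current) if feasible(s)]
        if not candidates:
            return current
        best = min(candidates, key=cost)
        if cost(best) >= cost(current):
            return current  # local optimum reached
        current = best
```

Multi-start iterative improvement would simply call this function from several starting solutions and keep the cheapest result.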

Two  very  successful  strategies  try  to  escape  local  optima  by  allowing  moves 
which  temporarily  increase  the  cost  of  the  solution.  Tabu  search  ([Glo77])  moves 
to  the  best  neighbor  at  each  iteration,  regardless  of  whether  or  not  it  improves  on 
the  current  solution.  To  avoid  cycling,  a  dynamic  list  of  tabu  solution  attributes  is 
kept.  Typically,  such  a  list  covers  recently  examined  solutions,  which  will  remain 
forbidden  for  a  certain  number  of  iterations.  Simulated  annealing  ([KGJV83]) 
randomly  selects  a  neighbor  at  each  iteration.  If  it  improves  on  the  current 
solution,  the  move  is  performed;  otherwise,  it  will  be  performed  with  a  certain 
probability which depends on the cost difference and which also decreases over
time  according  to  a  cooling  schedule.  Both  strategies  iterate  until  some  stopping 
criterion  is  satisfied. 
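
The tabu search loop can be sketched as follows. This is a deliberately simplified illustration: the tabu list stores whole recent solutions rather than solution attributes, the stopping criterion is a fixed iteration count, and the names and default parameters are our own assumptions.

```python
from collections import deque


def tabu_search(start, neighbors, cost, tenure=3, iterations=20):
    """Always move to the best non-tabu neighbor, even if it is worse."""
    current = best = start
    tabu = deque([start], maxlen=tenure)  # recently visited solutions
    for _ in range(iterations):
        candidates = [s for s in neighbors(current) if s not in tabu]
        if not candidates:
            break
        current = min(candidates, key=cost)
        tabu.append(current)  # the oldest entry drops out automatically
        if cost(current) < cost(best):
            best = current
    return best
```

On a cost landscape with a local optimum between the start and the global optimum, the tabu list prevents the search from stepping straight back, so it can climb out where iterative improvement would stop.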

One  crucial  aspect  in  all  of  these  local  search  methods  is  obviously  the  choice 
of the neighborhood structure (see for example [GTdW93]). Ambitious
neighborhoods increase the chances of success but are more expensive to explore for
methods  which  need  to  do  so.  Small  neighborhoods  are  both  simple  and  fast  to 
explore  but  may  prevent  us  from  ever  reaching  a  particularly  good  solution:  it 
could  require  a  sequence  of  local  moves  (as  opposed  to  a  single  one  in  a  larger 
neighborhood),  and  in  tabu  search  every  such  move  would  have  to  be  the  best 
one  locally.  A  neighborhood  which  strictly  includes  another  induces  a  search 
which  encounters  fewer  local  optima  and  thus  facilitates  their  avoidance. 

Because of the above, large neighborhoods seem to be the current trend,
as exemplified by the recent CROSS exchange ([TBG+95]) which generates a
neighborhood of size $O(n^4)$. Several techniques have been developed to speed
up this exploration. The size of the neighborhood can be somewhat reduced
either by ignoring parts of it which are unlikely to produce good solutions or, in
a more exact fashion, by interrupting a carefully engineered exploration when
the remainder can only lead to worse (or infeasible) solutions. In addition, the
feasibility of neighbors must be assessed. As an early example in routing
problems, [Sav85] describes a way to verify time window constraints in constant time
per  neighbor,  though  assumptions  about  the  neighborhood  structure  must  be 
made.  Others  may  perform  approximate  tests  of  feasibility  to  quickly  identify 
promising  neighbors  which  are  then  thoroughly  investigated,  with  the  potential 
risk  of  missing  the  best  one. 

So,  the  challenge  of  expressive  neighborhoods  has  been  met  with  specialized 
techniques  embedded  in  the  local  search  and  sometimes  with  compromises. 

2  Local  Search  as  Branch-and-Bound

This  adaptation  of  local  search  to  the  constraint  programming  paradigm  stems 
from  its  perception  as  a  generalization  of  traditional  branch-and-bound  search. 
In  the  latter  case,  we  branch  on  the  variables  of  the  model  and  the  neighborhood 
degenerates into the whole solution search space for which a single iteration
becomes sufficient since the optimal solution will necessarily be found. In order to
lift this approach to local search, the art then consists of choosing a
representation for a particular neighborhood structure which CP branch-and-bound search
can  exploit.  Each  iteration  of  local  search  will  simply  be  a  branch-and-bound 
search,  but  on  a  different  search  space.  In  the  remainder,  we  shall  concentrate 
on  the  way  neighborhood  structures  are  explored  during  a  single  iteration.  The 
arboreal  exploration  at  the  heart  of  a  branch-and-bound  strategy  remains  the 
way  we  examine  the  neighborhood  of  a  solution.  To  complete  the  parallel,  the 
active  role  of  modeling  constraints  together  with  lower  bounds  on  the  cost  of 
partial  solutions  will  help  prune  the  tree  and  thus  reduce  the  search  effort  over 
the  whole  neighborhood. 

We now formalize the idea presented above. Let $\mathcal{N}$ denote some
neighborhood structure for solutions to a problem $\mathcal{P}$. A set of finite domain variables
$\{\nu_1, \ldots, \nu_k\}$, usually distinct from the variables appearing in the model for $\mathcal{P}$,
together with a (possibly empty) set of constraints on $\{\nu_1, \ldots, \nu_k\}$ is a
neighborhood model for $\mathcal{N}$ if there is a one-to-one mapping between the set of feasible
combinations of values for $\{\nu_1, \ldots, \nu_k\}$ and the neighbors in $\mathcal{N}$.

To illustrate this, consider again the neighborhood of [CSKA94] for a route on
m cities. We introduce variables I and J, both ranging over 1, 2, ..., m, with
the constraint $I < J$ between them. If the solution is represented as $(c_1, \ldots, c_m)$,
the sequence of cities forming the route, a particular (feasible) combination of
values for $\{I, J\}$ is interpreted as exchanging entries $c_I$ and $c_J$ in that solution
to obtain a neighbor. One easily verifies that this constitutes a neighborhood
model.
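
The exchange neighborhood just described can be enumerated directly from its model. The sketch below is illustrative only (`exchange_neighborhood` is a hypothetical name); it assumes the cities in the route are pairwise distinct, which is what makes the mapping from pairs (I, J) to neighbors one-to-one.

```python
from itertools import combinations


def exchange_neighborhood(route):
    """Yield ((I, J), neighbor) for every pair of positions 1 <= I < J <= m."""
    m = len(route)
    for i, j in combinations(range(1, m + 1), 2):
        neighbor = list(route)
        # Positions are 1-based in the text, 0-based in the Python list.
        neighbor[i - 1], neighbor[j - 1] = neighbor[j - 1], neighbor[i - 1]
        yield (i, j), tuple(neighbor)
```

Dropping the restriction I < J would enumerate each swap twice, which is the surjective but not one-to-one situation.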

The  requirement  of  a  one-to-one  mapping  could  be  relaxed  to  a  surjective 
mapping  though  this  would  mean  that  a  neighbor  may  be  examined  more  than 
once. For example, without the constraint $I < J$, symmetries would lead to
identical  solutions.  However,  surjectivity  is  crucial  since  otherwise  we  would  miss 
some  of  the  neighbors. 

So we shall branch on $\{\nu_1, \ldots, \nu_k\}$ and also bound the cost of partially
constructed neighbors. As we saw in section 1, some heuristic algorithms based on
local search, such as tabu search and variations on iterative improvement, are
interested in acquiring the best solution in the neighborhood and hence do not
need to manipulate the whole of it per se. We can therefore record the cost of
the best neighbor found so far and compute lower bounds for partial neighbors
to further reduce the portion of the neighborhood that has to be explored, as in
traditional branch-and-bound.
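
A bare-bones version of this exploration is sketched below. It is a plain depth-first search rather than a constraint programming solver, so constraint propagation is reduced to a caller-supplied consistency test; `best_neighbor`, `consistent`, `lower_bound` and `cost` are hypothetical names introduced for illustration.

```python
def best_neighbor(domains, consistent, lower_bound, cost):
    """Branch on the neighborhood variables in order and prune with a bound.

    domains[d] lists the candidate values of the d-th neighborhood variable.
    consistent(partial) rejects partial assignments violating the constraints.
    lower_bound(partial) must never exceed the cost of any completion.
    Returns (cost, assignment) of the best neighbor, or (inf, None) if none.
    """
    best = (float("inf"), None)

    def explore(partial):
        nonlocal best
        if not consistent(partial) or lower_bound(partial) >= best[0]:
            return  # pruned by a constraint or by the bound
        if len(partial) == len(domains):
            value = cost(partial)
            if value < best[0]:
                best = (value, tuple(partial))
            return
        for candidate in domains[len(partial)]:
            explore(partial + [candidate])

    explore([])
    return best
```

With the two-variable exchange model, `domains` would hold the ranges of I and J, `consistent` would enforce I < J, and `cost` would evaluate the route obtained by the swap.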

We see two main advantages to such an approach:

- The modeling constraints for $\mathcal{P}$ (and often some of the variables) are kept
separate from the neighborhood model. This is especially useful when lots of
different side constraints are present in the problem: they will not clutter the
local search with explicit feasibility tests as is often the case in operations
research heuristics. They will rather be indirectly involved when some of the
modeling variables will be further constrained as a result of choices made on
$\{\nu_1, \ldots, \nu_k\}$ (section 3 describes in detail an instance of this). The end result
is a generic neighborhood exploration method parameterized by the type of
neighborhood but not by the modeling constraints.

- There is a strong relationship between the savings brought about by this
branch-and-bound approach and how ambitious the neighborhood is.
Typically, enlarging the neighborhood means increasing the degrees of freedom
and so requires a greater number of variables to encode its structure. This
translates into a greater depth of the (neighborhood) search tree and a
potentially larger gain with every branch pruned, either from the lower bound
at the particular node or the modeling constraints. This constitutes an asset
in view of the current trend toward more ambitious neighborhood structures.


3  An  Example  for  the  Traveling  Salesman  Problem  with 
Time  Windows 

This section provides a larger and more interesting instance of the general
framework just described. It addresses a well-known problem on which several local
search heuristics have been applied in the past. The traveling salesman problem
with time windows (TSPTW) consists of finding a minimum cost (usually the
total travel distance or total schedule time) tour of a set of cities where each city
is visited exactly once and which starts and ends at a unique depot. In addition,
each city must be visited within its own time window. Early arrival is allowed
but implies a waiting time until the beginning of the window. [Sav85] showed
that simply deciding whether there exists a feasible solution to an instance of the
TSPTW is NP-complete. Nevertheless, the full problem has important
applications in bank and postal deliveries, school-bus routing and scheduling, disjunctive
scheduling with sequence-dependent processing times, automated
manufacturing environments and as a subproblem of the vehicle routing problem with time
windows (VRPTW).

We will sketch a constraint programming model for the TSPTW since some of
its variables will be referred to later on. The actual details of this model appear
in [PGPR96] but are not relevant here. Let $V = \{2, \ldots, n\}$ represent the cities
to visit and duplicate the unique depot into an origin-depot and a
destination-depot, identified as 1 and $n + 1$, respectively. A tour thus becomes a Hamiltonian
path starting at 1 and ending at $n + 1$. At the heart of the model are variables
$S_i$, $i = 1, \ldots, n$, associated to each of the cities (and the origin-depot) and which
represent their successor in the tour. Their domain will therefore consist of the
integers in the range $2, \ldots, n + 1$. A valid tour assigns a distinct successor to each city
and avoids sub-tours. We also define predecessor variables $P_j$, $j = 2, \ldots, n + 1$,
which represent the symmetric counterpart of the $S_i$'s ($S_i = j \Leftrightarrow P_j = i$). To
account for the scheduling component of the problem, variables $T_i$ are introduced
to represent the time at which we visit each city. These must take a value within
their respective time window while being coherent with the order in which cities
appear on the tour and taking into consideration the travel time between cities.
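
To make the roles of the successor and time variables concrete, the sketch below checks a fully assigned tour. It is not the model of [PGPR96]: it assumes zero service times, starts the schedule at the opening of the origin-depot's window, and uses hypothetical names (`visit_times`, `successor`, `travel`, `windows`).

```python
def visit_times(successor, travel, windows, origin=1):
    """Follow successor[i] from the origin-depot and compute visit times.

    successor maps each node to the next node on the tour, travel[(i, j)]
    is the travel time from i to j, and windows[i] = (earliest, latest).
    Returns {node: visit time}, or None if a time window is missed, a node
    is revisited (sub-tour), or some node is never reached.
    """
    times = {origin: windows[origin][0]}
    city = origin
    while city in successor:
        nxt = successor[city]
        if nxt in times:
            return None  # the successors form a cycle
        arrival = times[city] + travel[(city, nxt)]
        earliest, latest = windows[nxt]
        if arrival > latest:
            return None  # too late for the time window
        times[nxt] = max(arrival, earliest)  # early arrival waits
        city = nxt
    return times if len(times) == len(windows) else None
```

In the constraint programming model these checks are not run after the fact; they are enforced by propagation as the successor variables become bound.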


3.1  Some  Neighborhoods  for  Routing  Problems 

Given  the  inherent  difficulty  of  the  problem,  several  OR  heuristic  algorithms 
have  been  designed  and  among  them  some  based  on  local  search.  We  briefly 
describe  the  most  popular  neighborhoods  in  the  context  of  routing  problems, 
which  include  the  TSPTW  but  may  allow  solutions  consisting  of  a  set  of  routes 
and  involve  other  restrictions  such  as  capacity  constraints. 

Neighborhoods generated by modifications at the level of vertices include
re-inserting a vertex or exchanging two vertices. GENI, a generalized insertion
procedure ([GHL92]), directs the re-insertion between some of its p (typically
5) nearest neighbors. In the $\lambda$-interchange ([Osm93]), subsets of vertices $S_i$, $S_j$
(with $|S_i| \le \lambda$ and $|S_j| \le \lambda$) are selected each from a different route and then
swapped. The possibility of an empty subset allows simply re-inserting vertices.
For efficiency reasons, $\lambda$ rarely exceeds 2.

At the level of edges, a k-interchange ([Lin65]) replaces k edges in the solution
by k others that reconnect the route(s). We explore the associated neighborhood
to find a k-opt solution, which cannot be improved by a k-interchange and is
therefore a local optimum. Various reported experiments use k = 2 or 3. An
Or-opt move ([Or76]) relocates a string of one, two or three consecutive vertices.
A 2-opt* move ([PR95]) replaces two edges $(v_i, v_j)$, $(v_k, v_l)$ from different routes
with edges $(v_i, v_l)$, $(v_k, v_j)$. This exchanges the end portion of one route with the
end of the other. The CROSS exchange ([TBG+95]) goes further by exchanging
a middle portion of one route with a middle portion of the other and includes
the two previous types of moves as special cases.


3.2  3-Opt  Edge  Exchange 

We will give a neighborhood model for the 3-interchange neighborhood. Let T
be the current tour. If I represents a city on this tour, then $I^+$ (resp. $I^-$) denotes
its successor (resp. predecessor) on T. We define the following binary relation on
cities: $I \prec_T J$ holds if I appears before J on T (the subscript will be dropped
unless necessary to avoid confusion). $I \preceq J$ will be used as shorthand for “$I \prec J$
or I is the same as J”.

Without loss of generality, let $I \prec J \preceq K$. A 3-interchange move from T
deletes the three edges $(I, I^+)$, $(J^-, J)$, $(K, K^+)$ and then reconnects the tour
by introducing three new ones on the vertices (cities) $I$, $I^+$, $J^-$, $J$, $K$ and
$K^+$. An orientation-preserving 3-interchange move reconnects the tour in the
only possible way which does not reverse any of the original route segments, by
adding edges $(I, J)$, $(K, I^+)$, $(J^-, K^+)$ (see figure 1). This is desirable when a
scheduling component is present in the problem, such as time windows. The new
tour would then start at the depot D, follow the same path to I as in T, go to
J, proceed to K as in T, go to $I^+$, proceed to $J^-$ as in T, go to $K^+$ and finally
return to D as in T. Two segments of the original route have been swapped but
the ordering within a segment is kept.

Fig. 1. The orientation-preserving 3-interchange.
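
On a tour stored as a list, the move amounts to swapping two adjacent segments. The sketch below is an illustration with hypothetical names; `i`, `j`, `k` are list indices of the cities I, J, K, and the list is assumed to begin with the origin-depot and end with the destination-depot.

```python
def orientation_preserving_3_interchange(tour, i, j, k):
    """Swap the segments I+ .. J- and J .. K without reversing either one."""
    # I+ must precede J, and K must have a successor K+ on the tour.
    assert 0 <= i and i + 1 < j <= k < len(tour) - 1
    head = tour[:i + 1]        # D .. I, unchanged
    first = tour[i + 1:j]      # I+ .. J-
    second = tour[j:k + 1]     # J .. K
    tail = tour[k + 1:]        # K+ .. D, unchanged
    return head + second + first + tail
```

For the tour [1, 2, 3, 4, 5, 6, 7] with I = 2, J = 5 and K = 6 (indices 1, 4, 5), the result is [1, 2, 5, 6, 3, 4, 7]: edges (2, 3), (4, 5), (6, 7) are replaced by (2, 5), (6, 3), (4, 7).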

The neighborhood defined by orientation-preserving 3-interchange moves can
be encoded through three finite domain variables I, J, K, ranging from 1 to n.
Their interpretation is the one given above. Because of the original constraint
$I \prec J \preceq K$, each neighbor is in one-to-one correspondence with a 3-tuple of
values describing it. We therefore have a neighborhood model.
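
The triples of this model and the cost change of each move can be enumerated as follows. This is a brute-force sketch without constraint propagation or bounding; it assumes the cost is the total travel distance (time windows are ignored), that J differs from $I^+$, and that `dist` is a caller-supplied distance function. All names are hypothetical.

```python
def interchange_triples(tour):
    """Yield list indices (i, j, k) with i + 1 < j <= k, K not the final depot."""
    last = len(tour) - 2  # index of the last city before the destination-depot
    for i in range(0, last - 1):
        for j in range(i + 2, last + 1):
            for k in range(j, last + 1):
                yield i, j, k


def move_delta(tour, dist, i, j, k):
    """Cost change of the orientation-preserving 3-interchange (I, J, K)."""
    I, I_next = tour[i], tour[i + 1]
    J_prev, J = tour[j - 1], tour[j]
    K, K_next = tour[k], tour[k + 1]
    removed = dist(I, I_next) + dist(J_prev, J) + dist(K, K_next)
    added = dist(I, J) + dist(K, I_next) + dist(J_prev, K_next)
    return added - removed


def best_move(tour, dist):
    """Return (delta, (i, j, k)) of the cheapest move, or None if there is none."""
    return min(((move_delta(tour, dist, i, j, k), (i, j, k))
                for i, j, k in interchange_triples(tour)), default=None)
```

Since no segment is reversed, the delta only involves the three deleted and three added edges, and the formula remains valid for asymmetric distances.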

Initially, the CP model for the problem is stated and the variables of that
model ($S_i$'s, $P_j$'s and $T_i$'s) are only constrained to that extent; in other
words, they are not bound to the current solution T. Then the local search
takes place. Within our tree search, the variable ordering will be fixed to I, J, K.
We now describe the information exploited at each level of this tree (refer to
figure 1 throughout).

level 0, the root of the tree:

Initially, we simply have that $I \in \{1, \ldots, n\}$, $J, K \in \{2, \ldots, n\}$, $I \prec J \preceq K$.
That latter constraint will become active as the domains of I, J, K shrink
(and eventually hold a single value).

level 1, I fixed:

1. The path from D to I is identical to that in T. We can perform the
following variable bindings: $S_D = D^+$, $S_{D^+} = D^{++}$, ..., $S_{I^-} = I$.

2. $I^+$ no longer belongs to the domain of $S_I$ since that edge will be removed,
and the domain of J can be identified with that of $S_I$.

3. Similarly, I no longer belongs to the domain of $P_{I^+}$ and the domain of
K can be identified with that of $P_{I^+}$.

level 2, I, J fixed:

1. Edge $(I, J)$ is added and the path from $I^+$ to $J^-$ is identical to that in
T: variables $S_I$ and $S_{I^+}, S_{I^{++}}, \ldots, S_{J^{--}}$ become bound.

2. $K^+$ must be one of the values in the domain of $S_{J^-}$. Consequently, the
domain of K can be further tightened: any value for K must appear in
$\{E^- \mid E \in \text{domain of } S_{J^-}\}$.

level 3, I, J, K fixed:

We have reached a leaf of our tree; the rest of the tour can be completed
by fixing the rest of the $S_i$'s to their appropriate value.

Applying branch-and-bound yields an orientation-preserving 3-opt move (we
will discuss lower bounds in section 3.3). At different stages of the search, we
manage to express constraints that restrict the set of allowable values for the
three variables on which that search is performed but, more importantly, we
constrain as well the principal modeling variables (the $S_i$'s) which may
independently prune our search tree by propagating these changes through modeling
constraints we need not know anything about. Note that the time variables $T_i$
were not directly involved either.

By  the  way,  the  neighborhood  structure  just  described  also  corresponds  to 
the  one  in  [Pug92].  As  for  the  other  ones  we  encountered  in  the  introduction, 
the  “repair”  moves  of  [CL95]  could  be  modeled  as  those  of  [CSKA94]  with  an 
extra  variable,  say  M,  to  represent  the  choice  of  a  machine;  to  fit  “shuffle” 
moves  into  our  framework,  since  branching  is  done  not  on  the  value  of  a  variable 
but  on  the  ordering  of  a  pair  of  tasks,  one  could  associate  a  neighborhood  model 
binary  variable  i/ij  to  every  pair  left  unordered  and  introduce  constraints  to 
propagate  the  transitivity  of  the  partial  order  relation.  The  other  neighborhood 
structures  described  in  section  3.1  are  easily  modeled. 


3.3  Lower  Bounds  on  the  Cost  of  a  3-Interchange  Neighbor 

Associate a cost $c_{ij}$ to every edge $(i,j)$ and let $C = \sum_{(i,j) \in T} c_{ij}$ be the total cost
of $T$. The cost of a neighbor $(I, J, K)$ of $T$ is given by the following formula,
replacing the costs of the old edges by that of the new:

$$C + (c_{IJ} - c_{II^+}) + (c_{KI^+} - c_{KK^+}) + (c_{J^-K^+} - c_{J^-J}). \qquad (1)$$

Given $I$, $J$, $K$, we can therefore compute this cost in constant time. In the
spirit of branch-and-bound search, we will seek lower bounds on the cost of
incomplete neighbors as we traverse the tree. The following formulas are easily
derived for the two interesting levels, 1 and 2. If only $I$ has been fixed (level 1),
we have:

$$C + \left(\min_{J}\{c_{IJ}\} - c_{II^+}\right) + \min_{J,K}\left\{(c_{KI^+} - c_{KK^+}) + (c_{J^-K^+} - c_{J^-J})\right\}. \qquad (2)$$

Similarly, if both $I$ and $J$ have been fixed (level 2):

$$C + (c_{IJ} - c_{II^+}) + \left(\min_{K}\left\{(c_{KI^+} - c_{KK^+}) + c_{J^-K^+}\right\} - c_{J^-J}\right). \qquad (3)$$

While the latter can be computed in $O(n)$ time, (2) requires $O(n^2)$ time because
of the last term in the sum. On the other hand, the number of possible values for
$J$ and $K$ once $I$ has been selected could often be a small fraction of $n$. We may
achieve a tighter lower bound by insisting that the minimizing $J$ be the same in
the second and third terms of (2), yielding:

$$C + \left(\min_{J}\left\{c_{IJ} + \min_{K}\left\{(c_{KI^+} - c_{KK^+}) + (c_{J^-K^+} - c_{J^-J})\right\}\right\} - c_{II^+}\right). \qquad (4)$$
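
The cost formula and the three bounds can be checked numerically on a toy instance. The following is a minimal sketch, not the authors' implementation: the coordinates, the function names (`cost_1`, `bound_2`, `bound_3`, `bound_4`) and the choice to minimize over all positions with $I \prec J \prec K$ (rather than over the current, possibly pruned, domains of $J$ and $K$) are assumptions made for illustration.

```python
import math

def succ(t, p):
    """City following position p in tour t (the tour closes back on t[0])."""
    return t[(p + 1) % len(t)]

def tour_cost(t, c):
    return sum(c[t[p]][succ(t, p)] for p in range(len(t)))

def inner(t, c, i, j, k):
    """(c_{K I+} - c_{K K+}) + (c_{J- K+} - c_{J- J}): the last two terms of (1)."""
    Ip, Jm, J, K, Kp = t[i + 1], t[j - 1], t[j], t[k], succ(t, k)
    return (c[K][Ip] - c[K][Kp]) + (c[Jm][Kp] - c[Jm][J])

def cost_1(t, c, C, i, j, k):
    """Exact neighbor cost, formula (1), in constant time."""
    I, Ip = t[i], t[i + 1]
    return C + (c[I][t[j]] - c[I][Ip]) + inner(t, c, i, j, k)

def candidates(t, i):
    """All position pairs (j, k) with i + 1 < j < k < len(t)."""
    n = len(t)
    return [(j, k) for j in range(i + 2, n) for k in range(j + 1, n)]

def bound_2(t, c, C, i):
    """Level-1 bound (2): the two minimizations are taken independently."""
    I, Ip = t[i], t[i + 1]
    pairs = candidates(t, i)
    return (C + (min(c[I][t[j]] for j, _ in pairs) - c[I][Ip])
            + min(inner(t, c, i, j, k) for j, k in pairs))

def bound_3(t, c, C, i, j):
    """Level-2 bound (3): only K remains free."""
    I, Ip, Jm, J = t[i], t[i + 1], t[j - 1], t[j]
    best = min((c[t[k]][Ip] - c[t[k]][succ(t, k)]) + c[Jm][succ(t, k)]
               for k in range(j + 1, len(t)))
    return C + (c[I][J] - c[I][Ip]) + (best - c[Jm][J])

def bound_4(t, c, C, i):
    """Level-1 bound (4): the same J is used in all terms."""
    I, Ip = t[i], t[i + 1]
    return C + (min(c[I][t[j]] + inner(t, c, i, j, k)
                    for j, k in candidates(t, i)) - c[I][Ip])

# Hypothetical instance: city 0 is the depot, costs are Euclidean distances.
points = [(0, 0), (2, 0), (4, 1), (5, 4), (3, 6), (1, 5), (0, 3)]
c = [[math.dist(a, b) for b in points] for a in points]
tour = list(range(len(points)))
C = tour_cost(tour, c)
print(bound_2(tour, c, C, 1), bound_4(tour, c, C, 1))
```

Because (4) forces the same $J$ in every term, it can never be smaller than (2), which is what makes it the tighter of the two level-1 bounds.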

From an implementation point of view, the effect of (1)-(4)-(3) can be achieved
by  initially  posting  a  constraint  relating  the  cost  of  the  neighbor  to  the  value  of 

4  Experimental  Results 

In  order  to  evaluate  the  potential  of  the  ideas  developed  in  the  paper,  the 
neighborhood  model  detailed  in  the  previous  section  was  tested  on  problem 
instances  taken  from  the  literature.  The  TSPTW  constraint  programming  model 
of  [PGPR96]  was  used  to  describe  the  problem. 

We  considered  two  sets  of  symmetric  Euclidean  problems  with  travel  times 
between  cities  taken  as  the  distance  separating  them.  The  first  set  comes  from 
[DDGS95]  and  features  instances  on  20  cities  uniformly  distributed  on  the  [0, 50]  x 
[0, 50]  grid  with  time  windows  of  maximum  width  20,  40,  60,  80  and  100.  These 
problems  tend  to  be  fairly  constrained.  The  second  set  uses  subproblems  of  the 
well-known VRPTW test bed in [Sol87]. The original problem instances feature
100  cities  distributed  on  the  [0,100]  x  [0,100]  grid  and  require  several  vehicles 
to  service  them  while  obeying  the  side  constraints.  Some  partition  of  the  cities 
such  that  each  group  may  be  visited  by  a  single  vehicle  yields  our  subproblems. 
Their  resulting  size  varies  from  16  to  49  cities.  We  used  some  of  the  instances 
in  the  C2,  R2  and  RC2  classes  of  problems  for  which  the  cities  are  respectively 
clustered,  uniformly  distributed  and  a  mixture  of  the  two.  This  second  set  is 
more  heterogeneous  and  instances  sometimes  include  very  few  meaningful  time 
windows. 

We introduce two types of indicators to analyze the results:

$$\gamma = \frac{|\text{feasible neighborhood}|}{|\text{neighborhood}|}$$

will indicate how constrained a problem instance is by evaluating for a given
solution the proportion of neighbors which are feasible;

$$\epsilon = \frac{\#\text{backtracks}}{|\text{neighborhood}|}$$

will measure the effort put into our neighborhood exploration by comparing the
number of backtracks required with the size of the neighborhood. Both $\gamma$ and
$\epsilon$ necessarily range between 0 and 1. Tests were conducted with and without
making use of lower bounds on the cost of partial solutions in order to evaluate
their impact. The search effort for the case without lower bounds will be denoted
$\epsilon''$.
The  results  on  the  first  set  are  summarized  in  figure  2  and  in  figures  3,  4 
and  5  for  the  three  classes  in  the  second  set.  Each  value  reported  represents  an 
average  over  a  few  problem  instances  and  solutions  to  them. 

Fig. 2. Indicators $\gamma$, $\epsilon''$ and $\epsilon$ for some instances in the first set of problems.


Our first observation is that practically all these problem instances are quite
constrained in the neighborhood of the solutions considered: $\gamma$ averages 0.03
and never rises above 0.18. Avoiding the systematic exploration of the whole
neighborhood therefore appears advantageous. The relatively small extent of
this first experiment makes it premature to conclude that the problem instances
have quite constrained neighborhoods for all their solutions but it would be
interesting to verify this. Secondly, our search effort is often a small fraction of
the neighborhood search space: $\epsilon''$ and $\epsilon$ average 0.13 and 0.09 respectively. In the
worst case, $\epsilon''$ approaches 0.5 (figure 5, rc208) but $\epsilon$ does not exceed 0.29 (figure
4, r204). The beneficial effect of maintaining lower bounds in the neighborhood
model goes as far as cutting down by half the number of backtracks on some
instances. Though the relative merits of $\epsilon''$ and $\epsilon$ have little to do with the CP
model used here, their actual values do. The model used includes redundant
constraints for more powerful propagation which ends up producing smaller $\epsilon$'s
but also slows down the exploration. As usual, a reasonable compromise must
be sought. It is worth mentioning that sometimes $\epsilon$ was even smaller than $\gamma$
(e.g. figure 3, c208); in such cases, the bounds were particularly productive in
pruning subtrees containing unattractive feasible neighbors.

Fig. 5. Indicators $\gamma$, $\epsilon''$ and $\epsilon$ for some instances in RC2.

Looking  at  the  sets  of  problems  individually,  all  three  indicators  in  figure 
2  tend  to  increase  with  the  width  of  the  time  windows,  as  one  would  expect. 
The  intriguing  periodicity  apparent  in  figures  3,  4  and  5  is  probably  due  to  the 
way  these  problems  were  originally  generated.  [Sol87]  mentions  that  different 
percentages  of  cities  with  time  windows  were  used,  namely  100,  75,  50  and  25%. 
A  greater  proportion  of  constraining  time  windows  will  likely  decrease  the  density 
of  feasible  neighbors. 

Conclusion 

We have proposed a novel way of looking at local search algorithms in constraint
programming. By maintaining branch-and-bound search at their core, we believe
that a cleaner and more natural integration is achieved. In addition, the familiar
pruning which can take place yields substantial savings on the number of
neighbors that actually need to be considered.

The extra efficiency brought about by lower bounds on the cost of partial
solutions makes methods which explore the whole neighborhood in search of the
best local move, such as tabu search, more attractive. We certainly advocate the
use of a local search method more subtle than plain iterative improvement, as
others within the CP community have already suggested.

This  work  on  neighborhood  exploration  is  part  of  ongoing  research  to  develop 
CP  tabu  search  algorithms  applied  to  routing  problems. 
Abstract. This paper describes an evolutionary algorithm that uses repair
to solve Constraint Satisfaction Problems (CSPs). Knowledge about properties
of the constraints network makes it possible to define a fitness function
which is used to improve the stochastic search. A selection mechanism
which exploits this fitness function has been defined. The algorithm has
been tested by running experiments on randomly generated 3-colouring
graphs, with different constraints networks. We have also designed a
specialized "permutation" operator, which improves the performance of the
classic crossover operator, reducing the number of generations and giving
faster convergence to a global optimum when the population is stuck in a
local optimum. The results suggest that the technique may be successfully
applied to other CSPs.

Keywords: Constraint satisfaction, evolutionary algorithms, fitness evaluation


1  Introduction 

Constraint satisfaction problems (CSPs) are widely used in artificial intelligence.
The goal is to find values for the problem variables that satisfy the imposed
constraints. Genetic algorithms have been applied to solving CSOP [15] and
CSP [5], [4], [12], [9], [1], [6]; however, researchers have concentrated
on studying the genetic representation and the reproduction mechanisms more
than on the definition of the evaluation function, which in general has been
defined as the number of satisfied constraints.

This paper introduces a new evaluation function. This fitness function uses
information about the connectivity of the constraints network, which is represented
by a constraints matrix. Such a network offers potential advantages in terms of
knowledge about the interaction between variables (epistasis). In addition, a
fitness-proportionate selection algorithm has been defined to improve the genetic
search. The remainder of this paper is organized as follows. After defining what
we mean by CSP in relation to Genetic Algorithms (GA) in section 2, an evolutionary
algorithm with a new approach to calculating the evaluation function is presented
in section 3. In section 4 we then address the graph 3-colouring problem, subject
to the restriction that adjacent nodes in the graph must be coloured differently;
that section also contains the definition of a new "permutation" operator which
improves the performance of the classic crossover operator. Pointers to future
research and conclusions are given in section 5.

2  Constraint  Satisfaction  Problems  and  Genetic 
Algorithms 

A Constraint Satisfaction Problem (CSP) is composed of a set of variables
$V = \{X_1, \ldots, X_n\}$, their related domains $D_1, \ldots, D_n$ and a set containing $NC$
constraints on these variables. The domain of a variable is a group of values
to which the variable may be instantiated. The domain sizes are $m_1, \ldots, m_n$
respectively, and we let $m$ denote the maximum of the $m_i$. Each constraint $C_\alpha$
is relevant to a subset of variables $X_{j_1}, \ldots, X_{j_k}$, where $\{j_1, \ldots, j_k\}$ is some
increasing subsequence of $\{1, 2, \ldots, n\}$. It may be regarded as containing all tuples
of values over this set of variables that are allowed with respect to $C_\alpha$. A
constraint which is relevant to exactly one variable is called a unary constraint.
Similarly, a binary constraint is relevant to exactly two variables. A solution to
the CSP consists of an instantiation of all the variables which does not violate
any of the constraints, i.e., a consistent labeling of each variable with a value from
its domain. The simplest algorithm is the brute force algorithm (generate and
test), which simply tries every possible combination of values for the variables.
In genetic algorithms the variables will be the genes in our representation of
the chromosome, the variable values will be the alleles and the goal is to find
an instantiation of the chromosome which does not violate any constraint. A
generate and test algorithm is equivalent, in the worst case, to generating a
population of size $m^n$. Obviously this population will contain at least one
solution chromosome (if the CSP has a solution). However, most CSPs are
NP-complete, which implies that there are no known polynomial time algorithms
which can guarantee finding a solution, owing in part to the size of the domains,
to the number of variables and to the structure of the constraints network.
Traditionally the research community on constraints has attempted to develop
techniques that improve algorithm performance using knowledge about the
constraints [3], [11], [7], for example by pruning the search space. In order to use
the knowledge about constraints in genetic algorithms, we concentrate our
attention on the constraints network, which we represent by a constraints matrix.
For simplicity we restrict our attention here to binary CSPs, in which the
constraints involve two variables.

- Definition 1: A Constraints Matrix $R$ is an $NC \times n$ rectangular array
which presents the information inherent in a constraints graph.
The $n$ columns correspond to variables and the $NC$ rows correspond to
constraints. An element in the $i$th row and $j$th column will be "1" if the
variable $j$ is relevant with respect to constraint $i$, and it will be "0" if it is
irrelevant. Figure 1 presents a constraints matrix and its constraints graph.
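
Definition 1 can be illustrated with a short sketch. The function name `constraints_matrix`, the zero-based variable indices and the small example network are assumptions made for illustration; they are not the network of Figure 1.

```python
def constraints_matrix(n, constraints):
    """Build the NC x n matrix R: R[i][j] = 1 iff variable j occurs in constraint i.

    `constraints` is a list of scopes, each scope being a tuple of
    zero-based variable indices.
    """
    R = [[0] * n for _ in constraints]
    for i, scope in enumerate(constraints):
        for j in scope:
            R[i][j] = 1
    return R


# Hypothetical binary network on 4 variables: X0-X1, X1-X2, X2-X3.
R = constraints_matrix(4, [(0, 1), (1, 2), (2, 3)])
for row in R:
    print(row)
```

In a binary CSP every row of $R$ contains exactly two 1s, while the column sums give the number of constraints each variable takes part in.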

Fig. 1. Constraint Network

An evolutionary algorithm has been created, which is a genetic algorithm
transformed to suit the CSP, in order to use the knowledge about constraints and
variables. Our approach is based on the first evolution programming principle:
appropriate data structures should do the job of taking care of constraints [10].


3  An  Evolutionary  Algorithm  for  CSP:  FCA 

Designing  an  evolutionary  algorithm  (EA)  involves  the  following  six  components: 

1.  An  initial  population 

2.  A  genetic  representation  of  chromosomes 

3.  Genetic  operators:  crossover,  mutation,  other  specialized  operators.

4.  An  evaluation  function 

5.  A  selection  algorithm 

6.  Parameters:  population  size,  probabilities  of  genetic  operators.

The  structure  of  an  evolutionary  algorithm  is: 


Begin /* procedure Evolutionary Algorithm */
   t = 0
   initialize population P(t)                  (1)
   evaluate fitness of individuals in P(t)     (4)
   while (not termination-condition) do
   begin
      t = t + 1
      Parents = select-parents-from P(t-1)     (5)
      Children = alter Parents                 (3)
      P(t) = Children
      evaluate P(t)                            (4)
   end
   endwhile
End /* procedure Evolutionary Algorithm */


Fig.  2.  Structure  of  an  EA  and  its  components 


Our research effort has been principally concentrated on the genetic
representation of chromosomes, on the evaluation function (or fitness function), on the
selection algorithm, and on the crossover genetic operator, i.e., components 2,
3, 4, 5.


3.1  Initial  Population 

The initial population is generated randomly (component 1). The variables'
values are selected from their domains with a uniform probability distribution.

3.2  Genetic  Representation 

We have selected a non-binary genetic representation, taking into consideration
that the variable domains involved in a CSP could be of different types, i.e.,
a constraint satisfaction problem, in general, has variables with real domains,
boolean domains and class domains. Therefore our genetic representation has the
structure shown in Figure 3.


X1    X2    X3    ...   Xi    ...   Xn
A     1     1024  ...   0     ...   U

Fig. 3. Chromosome representation


3.3  Genetic  Operators 

In our first set of tests we worked with the classical mutation and crossover
genetic operators for component 3 (alter Parents) [8]. We then incorporated
a "permutation" operator which helps the crossover operator when the
population is stuck in a local optimum. In the second set of tests we used
only a good specialized asexual operator (#,r,b), defined by Eiben in [6], to alter
the population for the 3-colouring graph problems.


3.4  Evaluation  Function 

Different  fitness  functions  for  the  CSP  have  been  defined  in  the  CSP  literature, 
for  example,  Eiben  [5]  proposed  for  the  N-Queens  problem  a  minimization  of  the 
number  of  unsatisfied  diagonal  constraints.  For  the  graph  colouring  problem  [5], 
[9]  proposed  to  minimize  the  number  of  violated  constraints.  Thorton  [14]  pro¬ 
posed  to  minimize  Y  =  where  di  is  the  normalized  error  for  the  constraint 
i. 

However, few researchers have taken into account the structure of the constraints
network, even though it has been very important in research on conventional
search techniques for CSPs. Recently, Dozier [4] proposed an evaluation
function which determines an individual's fitness by subtracting the weights of
all violated breakouts from the number of constraints satisfied by the candidate
solution that the individual represents. Considering only the number of violated
constraints implicitly means that no preference exists between the constraints,
and it does not take into account the number of variables in a constraint.
Intuitively, a first preference may be: the more important constraints are those
that involve more variables.

In a binary constraint network every constraint will have the same preference.
At this point we incorporate another concept, namely involved variables.
- Definition 2:

We shall now define the sets $\vartheta_\alpha \subseteq V$, $\alpha = 1, \ldots, NC$. Roughly speaking, $\vartheta_\alpha$ will
contain variables involved in constraints that are unsatisfied. More precisely,
$X_j \in \vartheta_\alpha$ if and only if one of the following conditions is satisfied:

•  $X_j$ appears in the constraint $C_\alpha$ (i.e. $R[\alpha, j] = 1$), and $C_\alpha$ is not satisfied.

•  For some $l$, $X_l$ appears in the constraint $C_\alpha$ ($R[\alpha, l] = 1$) and in the
constraint $C_\beta$ ($R[\beta, l] = 1$), $X_j$ appears in the constraint $C_\beta$ ($R[\beta, j] = 1$),
and $C_\alpha$ is not satisfied.

This definition shows that there are variables whose values may involve more
than one constraint; more precisely, the effect of changing the value of one variable
would be reflected in other constraints. It is this effect that we have incorporated
in the evaluation function. To define our evaluation function the following
definition is necessary:

- Definition 3: Let the following be given:

$R$, the constraint matrix (Def 1), and $C_\alpha$, the unsatisfied constraint that contains the
variables $X_k$ and $X_l$ ($R[\alpha, k] = R[\alpha, l] = 1$).

The Error-evaluation $EC_\alpha$ for $C_\alpha$ is defined as

$EC_\alpha$ = Number of variables in $\vartheta_\alpha$ (Def 2), i.e.

$EC_\alpha$ = (Number of variables in $C_\alpha$) + (Propagation Effect of $X_k$ and $X_l$)

where the Propagation Effect of $X_k$ and $X_l$ in a binary constraint network is
defined as the number of constraints $C_\beta$, $\beta = 1, \ldots, NC$, $\beta \neq \alpha$, that include
either $X_k$ or $X_l$ (i.e. $R[\beta, l] = 1$ or $R[\beta, k] = 1$).

The Error-evaluation for an unsatisfied constraint $C_\alpha$ can be represented in
terms of the constraint matrix $R$ as:

$$EC_\alpha = \left(\sum_{j=1}^{n} R[\alpha, j]\right) + \left(\sum_{\beta=1,\,\beta\neq\alpha}^{NC} R[\beta, k] + \sum_{\beta=1,\,\beta\neq\alpha}^{NC} R[\beta, l]\right)$$

If $C_\gamma$ is satisfied then we define $EC_\gamma = 0$.

In order to make Definition 3 a bit clearer, suppose that we try repairing a
chromosome which satisfies all constraints except $C_\alpha$. We must modify the
values instantiated for $X_k$ and/or $X_l$; this change of genes has an effect which
will be propagated to the other variables connected by constraints in the network to
$X_k$ and/or $X_l$.

Now we extend this definition to n-ary constraints networks. The idea is: if we
have a chromosome whose alleles violate a constraint, then in order to repair it, in
the worst case, we must change both the values of all variables participating in this
constraint and the values of the variables connected to them by the other constraints
in the network. So the Propagation Effect uses the same idea as for binary
constraints networks, that is:

- Definition 4:

Let the following be given:

$R$, the constraint matrix (Def 1), and $C_\alpha$, the unsatisfied constraint that contains the
variables $X_{k_i}$, $i \in [1, l]$ ($R[\alpha, k_i] = 1 \ \forall i$).

The Error-evaluation n-ary $EC_{\alpha n}$ for $C_\alpha$ is defined as

$EC_{\alpha n}$ = (Number of variables in $C_\alpha$) + (Propagation Effect of $X_{k_i} \ \forall i$)

where the Propagation Effect of $X_{k_i}$ in an n-ary constraint network is defined as
the number of variables in the constraint $C_\beta$, $\beta = 1, \ldots, NC$, $\beta \neq \alpha$, which
includes $X_{k_1}$ or $X_{k_2}$ ... or $X_{k_l}$ (i.e. $R[\beta, k_1] = 1$ or $R[\beta, k_2] = 1$ ... or $R[\beta, k_l] = 1$).

The Error-evaluation n-ary for an unsatisfied constraint $C_\alpha$ can be
represented in terms of the constraints matrix $R$ as:

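$$EC_{\alpha n} = \left(\sum_{j=1}^{n} R[\alpha, j]\right) + \sum_{\substack{\beta = 1,\ \beta \neq \alpha \\ C_\beta \text{ shares a variable with } C_\alpha}}^{NC} \left(\sum_{j=1}^{n} R[\beta, j]\right)$$

If $C_\gamma$ is satisfied then we define $EC_\gamma = 0$.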

Finally, our fitness function is the sum of the Error-evaluations of the
constraints, that is:

- Definition 5:

Given the constraint matrix $R$ (Def 1) and $EC_\alpha$, the Error-evaluation for
constraint $C_\alpha$ (Def 3), $\alpha = 1, \ldots, NC$, we define an Evaluation Function $Z$
for a binary constraint satisfaction problem as:

$$Z = \sum_{\alpha=1}^{NC} EC_\alpha$$

The goal of the search is to minimize our evaluation function, which equals
zero when all constraints are satisfied.
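
Definitions 3 and 5 can be sketched for a binary network as follows. This is an illustrative reading of the definitions, not the authors' code: the function names, the zero-based indices, and the small path-shaped colouring instance are assumptions.

```python
def error_evaluation(R, a):
    """EC_a for an unsatisfied binary constraint a (Definition 3)."""
    scope = [j for j, r in enumerate(R[a]) if r == 1]
    # Propagation effect: the other constraints that mention a variable of C_a.
    propagation = sum(R[b][j] for b in range(len(R)) if b != a for j in scope)
    return len(scope) + propagation

def fitness(R, violated):
    """Z (Definition 5): satisfied constraints contribute 0 to the sum."""
    return sum(error_evaluation(R, a) for a in violated)

def violated_constraints(edges, colours):
    """Indices of colouring constraints whose two end nodes share a colour."""
    return [a for a, (u, v) in enumerate(edges) if colours[u] == colours[v]]


# Hypothetical network: a path X0 - X1 - X2 - X3 with one constraint per edge.
edges = [(0, 1), (1, 2), (2, 3)]
R = [[1 if j in e else 0 for j in range(4)] for e in edges]
for colours in ([0, 1, 0, 1], [0, 0, 1, 0], [0, 1, 1, 0]):
    print(colours, fitness(R, violated_constraints(edges, colours)))
```

The second and third colourings each violate a single constraint, but the middle constraint touches two neighbouring constraints and is therefore penalized more heavily than the end constraint, which a plain count of violated constraints would not distinguish.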

In a binary network, this Evaluation Function can be viewed as a way of
quantifying our preference for the chromosomes whose variable values satisfy more
arcs and paths.

Note: In the same way, for an n-ary CSP: $Z = \sum_{\alpha=1}^{NC} EC_{\alpha n}$.

For instance, consider random CSPs characterized by four parameters: $n$,
the number of variables; $m$, the number of values in each variable's domain; $p_1$,
the probability that there is a constraint between a pair of variables; and $p_2$, the
conditional probability that a pair of values is inconsistent for a pair of variables,
given that there is a constraint between the variables. Our fitness function
depends more strongly on $p_1$ than the common fitness function, the number of
violated constraints: given $NC_v$, the number of violated constraints,
the expected value of $Z$ in a random CSP is $2 NC_v\, p_1 (n - 1)$. This can be seen in
Figure 4 and Figure 5.


Fig. 4. Common fitness function vs. $p_1$ and $p_2$


3.5  Selection  Algorithm 

The fitness function allows us to define a new selection algorithm. First of all,
the best chromosome of a generation is selected for the next generation (elitist
approach). We wish to favour the chromosomes with lower fitness more strongly
than the standard roulette wheel selection method does [8]. However,
illegal individuals will also have a probability of being selected in our algorithm,
because legal individuals often require the production of illegal individuals
as intermediate structures [10].

We have designed the selection strategy of Figure 6.

In selecting $\alpha$ and $\beta$ we compared, for different constraints networks, the
number of generations required to find a solution. The best results have been
obtained with $\alpha = 0.5$ and $\beta = 0.85$.

Fig. 5. New fitness function vs. $p_1$ and $p_2$

Begin /* Selection Algorithm SA-I */
   Choose j from U[0, 1]
   if (j < α) then
      Choose chromosome with fitness <= average-fitness
   else
      if (j < β) then
         Choose chromosome with fitness < average-fitness + standard-deviation
      else
         Choose chromosome from U[1, population-size]
      endif
   endif
End /* Selection Algorithm SA-I */


Fig.  6.  Selection  Algorithm 
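
A runnable reading of the selection procedure might look like the sketch below. Several points are assumptions, not details confirmed by the text: the function name `select_sa1`, the use of the population (rather than sample) standard deviation, uniform choice within each eligible pool, and the fallback to the whole population when a pool is empty (which can happen in the second branch when all fitness values are equal).

```python
import random
import statistics

def select_sa1(population, fitness, alpha=0.5, beta=0.85, rng=random):
    """Pick one chromosome following the three branches of SA-I.

    Lower fitness is better, since the evaluation function is minimized.
    """
    mean = statistics.fmean(fitness)
    std = statistics.pstdev(fitness)          # assumption: population std dev
    j = rng.random()                          # j drawn from U[0, 1)
    if j < alpha:
        # Region A: fitness not worse than the average.
        pool = [c for c, f in zip(population, fitness) if f <= mean]
    elif j < beta:
        # Regions A and B: fitness below average + one standard deviation.
        pool = [c for c, f in zip(population, fitness) if f < mean + std]
    else:
        # Any chromosome, including those in region C.
        pool = list(population)
    return rng.choice(pool or list(population))


population = ["a", "b", "c", "d"]
fitness_values = [1, 2, 3, 10]
print(select_sa1(population, fitness_values, rng=random.Random(0)))
```

With these values the mean fitness is 4, so only "d" lies outside region A; it can be returned only when the third branch is taken.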


Statistical Properties of SA-I. Considering the values of $\alpha$ and $\beta$, the population
is divided into three regions A, B, C, as shown in Figure 7. Suppose
that we have a population of size $N$, with $n_1$ chromosomes in region A, $n_2$ in region
B and $n_3$ in region C, such that $n_1 + n_2 + n_3 = N$; then we have the following selection
probabilities for a chromosome:

-  from region A = $\alpha + (\beta - \alpha) \cdot \frac{n_1}{n_1 + n_2} + (1 - \beta) \cdot \frac{n_1}{N}$

-  from region B = $(\beta - \alpha) \cdot \frac{n_2}{n_1 + n_2} + (1 - \beta) \cdot \frac{n_2}{N}$

-  from region C = $(1 - \beta) \cdot \frac{n_3}{N}$

Figure 7 shows that we have a preference for the chromosomes in region A,
i.e., we prefer individuals whose fitness is lower than or equal to the average.
However, there remains a probability of selecting chromosomes from region C,
whose fitness is greater than the average plus the standard deviation.


Fig.  7.  Selection  regions 


4  An  example:  Graph  Colouring  Problem 

In order to illustrate our evolutionary algorithm, consider a small
graph colouring problem, subject to the restriction that adjacent nodes must be
coloured differently.

The  graph  is  shown  in  Figure  8  and  it  consists  of  seven  variables  and  eleven 
constraints.  In  coloring  the  graph,  we  can  use  the  three  colors  red,  white  and 
blue. 


Fig.  8.  Graph  example 


4.1  First  Analysis 

This particular graph could be reduced by concentrating our attention on the
constraints between nodes 2, 4 and 6, applying the reduction algorithm proposed
by Cheeseman [2]. If we have a consistent instantiation for 2, 4 and 6, it is easy to
find values for the other variables which satisfy all constraints. However, in
order to illustrate our algorithm, the search has been carried out with all nodes,
considering the new fitness function. Our fitness function gives us a
preference hierarchy for constraints to be satisfied. For example, the constraint
Cl  between  node^  and  nodeQ  is  more  important  to  satisfy  than  constraint  Ciq 
between  nodes  and  nodee.  Both  involve  two  variables.  Analysing  the  network 
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structure,  node^^  that  is  relevant  to  Ci,  is  strongly  connected  (nodes  0,5,1),  on 
the  other  hand  nodes ^  that  is  relevant  to  Cio,  is  only  connected  to  node4. 


4.2  Results 

For this first set of tests we worked with the classical mutation and crossover
genetic operators. This example is almost trivially simple, of course; the thing
to note is that the structure of the constraints network is very important for
improving the efficiency of the search. Further, we tested different orders for the
variables in the chromosome on the same problem. To that end, different topologies
with 7 variables and 11 constraints were tested. For each of them, 30 different
initial orders of variables in the chromosome were generated. The maximum
number of generations was fixed at 100. Figure 9 shows the performance of FCA.
The x-axis represents the variable orders and the y-axis represents the number
of generations for each order. We observed the existence of orders that lead to
performance degradation.

This analysis suggests that a specialized operator of the "permutation" type
should also help to increase search performance. Intuitively, a permutation
operator is justified by the high degree of interaction between the variables in
the CSP. For example, an order can be a bad one when constraints are violated by
variables that are neighbours in the chromosome, because it is difficult for the
crossover operator to break them apart. To understand precisely what this means,
we consider the important concept of a schema [8]. A schema is a similarity
template describing a subset of chromosomes with similarities at certain
chromosome positions. If we have a chromosome with m = 4 variables, an example
of a schema is * 8 2 *, where * is a don't care symbol. We are interested in the
chromosomes whose alleles are matched by the schema, i.e., in every chromosome
in which the value of the second variable is 8 and the value of the third
variable is 2. A schema Sch has the following properties: schema order $o(Sch)$
and defining length $\delta(Sch)$, where the order of a schema is the number of
fixed positions and the defining length is the distance between the first and
last fixed chromosome positions. The example schema has $o(Sch) = 2$ and
$\delta(Sch) = 3 - 2 = 1$. Suppose that the constraints are violated by the
values 8 and 2; then the chromosomes matched by this schema are undesirable,
because these values will not be contained in any solution. We know from the
fundamental schema theorem that the destruction probability of a schema Sch
under one-point crossover is $p_d(Sch) = \delta(Sch)/(m-1)$. In other words, it
is the probability that, after crossover, the next generation will not have
chromosomes matched by this schema. A permutation operator makes it possible to
modify the value of $\delta(Sch)$ by changing the variable positions in the
chromosome; in consequence, the probability of destruction will also be altered.
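The schema quantities above can be checked on the example with a small sketch. It assumes the usual one-point crossover form of the destruction probability, $\delta(Sch)/(m-1)$; the function names and the list encoding of a schema (with "*" as the don't care symbol) are illustrative choices, not taken from the paper.

```python
def schema_order(schema):
    """o(Sch): number of fixed (non-'*') positions."""
    return sum(1 for s in schema if s != "*")


def defining_length(schema):
    """delta(Sch): distance between the first and last fixed positions."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def destruction_probability(schema):
    """Assumed form delta(Sch) / (m - 1) for one-point crossover."""
    return defining_length(schema) / (len(schema) - 1)


def permute(schema, order):
    """Reorder positions: new position i takes the old position order[i]."""
    return [schema[i] for i in order]


schema = ["*", 8, 2, "*"]             # the example schema, m = 4
moved = permute(schema, [1, 0, 3, 2])  # fixed values pushed to both ends
print(schema_order(schema), defining_length(schema))
print(destruction_probability(schema), destruction_probability(moved))
```

In this sketch, moving the two fixed values to opposite ends of the chromosome raises the defining length from 1 to 3, which is the effect the permutation operator is meant to exploit for undesirable schemata.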

The goal of the permutation operator is to increase the effectiveness of crossover.

Fig. 9. Constraint network: 7 variables and 11 constraints

Permutation Operator The permutation operator is activated only when we detect
that the chosen order is not good, because if a good order has been chosen the
algorithm will naturally converge without additional help. However, if the
chosen order is a bad one, the algorithm needs help, namely our new operator, to
converge faster to the solution (if one exists). We introduce another parameter,
the permutation probability, which works on the same idea as the mutation and
crossover probabilities. It is the probability that the population will change
the order of the variables in the chromosome. The principal difference from the
other operator probabilities is that the whole population is concerned, i.e., if
we decide to apply the permutation operator it will affect every member of the
population, and in the same way. While the permutation operator has not been
activated, the permutation probability is zero. In order to identify whether an
initial order needs the permutation operator to be activated, we require the
following notion of stability.


—  Definition  6: 

An  order  is  experiencing  stability  if  the  algorithm  has  found  the  same  best 
chromosome  in  the  last  S  generations. 

Figure 10 shows when our permutation operator can be activated during the
evolution process. The structure of the permutation procedure is shown in
Figure 11. Once the permutation operator is activated, the evolution process
continues applying it according to the permutation probability in every new
generation. We have included this operator in our algorithm FCA. It has been
tested with S = 25 for the CSPs analyzed in Section 4. A permutation probability
equal to 0.3 has been used. The results show an increase in search performance
of at least 20% for the worst initial orders, while for the best initial orders
the performance is the same.

Fig. 10. Activation of the genetic permutation operator

Begin /* permutation operator */
  if permutation operator is active
    Generate a random number r from the range [0..1]
    if r < probability of permutation
      Generate a random new chromosome order for [1..Number-of-variables-1]
      for all the chromosomes in the population
        Change the place of the variable values according to the new order
      end for
    end if
  end if
End /* permutation operator */


Fig.  11.  Structure  of  permutation  procedure 
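The procedure of Figure 11, together with the stability test of Definition 6, might be sketched as follows. This is a minimal illustration under assumptions: chromosomes are Python lists of equal length, the whole index range is shuffled, and the names `is_stable` and `permutation_operator` are hypothetical rather than taken from FCA.

```python
import random


def is_stable(best_history, s):
    """Definition 6: the same best chromosome in the last s generations."""
    if len(best_history) < s:
        return False
    return all(b == best_history[-1] for b in best_history[-s:])


def permutation_operator(population, active, p_perm, rng=random):
    """Apply one shared random reordering to every chromosome.

    Returns (population, order); order is None when nothing was permuted.
    """
    if not active or rng.random() >= p_perm:
        return population, None
    order = list(range(len(population[0])))
    rng.shuffle(order)  # one new order for the whole population
    permuted = [[chrom[i] for i in order] for chrom in population]
    return permuted, order


population = [[1, 2, 3, 4], [5, 6, 7, 8]]
new_population, order = permutation_operator(
    population, active=True, p_perm=1.0, rng=random.Random(0))
```

A caller would need to keep track of the returned order, since after a permutation position i of a chromosome no longer holds the value of the i-th variable.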


4.3  Comparison 

We have also generated 1000 random CSPs with different topologies, with a
degree of connectivity in [4, 6], for 30 variables. We have compared three
algorithms which differ in their fitness functions and selection algorithms
(Figure 12). All of them work with a good specialized asexual operator defined
by Eiben (cf. [6]) as: The heuristic asexual operators are based on the idea of
improving an individual by changing some of its genes. An asexual operator
selects a number of positions in the parent, then selects new values for these
positions. The number of modified values, the criteria for identifying the
positions of the values to be modified, and the criteria for defining the new
values for the child are the defining parameters of the asexual operators.
Therefore an asexual operator is denoted by the triple (n, p, g), where n
indicates the number of modified values, and the values for p and g are chosen
from the set {r, b}, where r indicates uniform random selection and b indicates
some heuristic-based biased selection. For this kind of problem the best
parameters for this operator were (n, p, g) = (#, r, b), where # means that the
number of values to be modified is chosen randomly but is at most 1/4 of all
positions.
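An operator with parameters (#, r, b) could look roughly like the sketch below. The text does not say which heuristic the biased selection b uses, so this example assumes a min-conflicts choice (the value that leaves the fewest violated constraints); the encoding of binary constraints as `(i, j, allowed_pairs)` triples is also an assumption made for illustration.

```python
import random


def violations(chrom, constraints):
    """Count violated binary constraints; each is (i, j, allowed_pairs)."""
    return sum(1 for i, j, allowed in constraints
               if (chrom[i], chrom[j]) not in allowed)


def asexual_operator(parent, domain, constraints, rng=random):
    m = len(parent)
    # '#': random number of modified values, at most 1/4 of all positions
    n = rng.randint(1, max(1, m // 4))
    # p = r: positions chosen uniformly at random
    positions = rng.sample(range(m), n)
    child = list(parent)
    for pos in positions:
        # g = b: biased choice, assumed here to be min-conflicts
        child[pos] = min(
            domain,
            key=lambda v: violations(child[:pos] + [v] + child[pos + 1:],
                                     constraints))
    return child
```

Because the current value is itself a candidate when it belongs to the domain, the child in this sketch never violates more constraints than its parent.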


Algorithm              Fitness function                  Selection algorithm
Common_ff              Number of constraints violated    Roulette wheel
New_ff                 Fitness of Definition 5           Roulette wheel
New_ff + new_select    Fitness of Definition 5           SA-I

Fig. 12. Three algorithms: Common_ff, New_ff and New_ff + new_select


For each connectivity we have generated 100 different random graphs. Figure
13 shows the percentage of solutions found by the three algorithms. The new
fitness function (New_ff) with SA-I gives the best results. In the worst case
it found 70% of the solutions, in contrast to Common_ff, which in the worst
case found 20% of the solutions. Figure 14 shows the average number of
generations for each graph connectivity. The average number of generations for
Common_ff is greater than for the other algorithms. Furthermore, we can observe
that New_ff performs best when it uses the selection algorithm SA-I.

5  Conclusion 

A new evolutionary algorithm has been presented which repairs pre-solutions
to find a solution of a CSP. It takes into account the structure of the
constraint network in order to define a better evaluation function for CSPs.
This evaluation function has been used to construct a selection algorithm that
strongly favours the better individuals; however, it does not prevent the
production of illegal individuals. Our research allows us to conclude that the
structure of a constraint network is very important for guiding the search.
Furthermore, for problems with a high degree of interaction (epistasis), such
as CSPs, the order of nodes in the chromosome can lead to degradation of
performance when the algorithm uses a crossover operator to alter the
population. This CSP characteristic has also been taken into account in
designing a new "permutation" operator, which is activated when the initial
order of variables is not good. It has reduced the number of generations by 20%
on average for the worst orders.
Fig. 13. Percentage of solutions found by Common fitness, New fitness and New fitness
with new selection algorithm for different connectivities


[Plot: average generations vs. connectivity]


Fig.  14.  Comparison:  Common  fitness,  New  fitness  and  New  fitness  with  new  selection 
algorithm  for  different  connectivities 


There  are  a  variety  of  ways  in  which  the  techniques  that  we  have  presented  can 
be  extended.  The  principal  advantage  of  our  method  is  that  it  is  general,  i.e.,  the 
approach  to  estimate  the  evaluation  function  is  not  related  to  a  particular  CSP. 
Now  our  research  is  directed  towards  defining  better  genetic  operators  which 
consider  the  structure  of  the  constraints  network. 

Abstract. In this paper we introduce the concept of existential variables
in finite domain constraint problems. A variable is existential if, once
the other variables have been instantiated, one can always find a value for it
such that all constraints are satisfied. In other words, existential variables
(and the constraints they are connected to) do not add any information to the
rest of the constraint problem. We use the notion of existential variable to
achieve what we call incremental k-consistency, which means that different
levels of (strong) consistency are obtained on different subparts of the
problem, all lower than or equal to k. At the end, the overall problem
can be solved by a backtrack-bounded search, and the complexity of the
search will be the same as (or smaller than) if the CSP were k-consistent
everywhere. We also consider (1,k)-consistency, which is a form of local
consistency which is more powerful than arc-consistency while still removing
just domain elements (and thus never adding any new constraint), and we
discuss how to get a similar algorithm also for this form of consistency.


1  Introduction 

Finite domain constraint problems (CSPs) [Mon74, Mac92] are a very powerful
knowledge representation formalism. In fact, many real-life situations can
be easily cast as CSPs. However, sometimes it is not easy to model a real-life
problem as a CSP, and one of the results is that some level of redundancy is
introduced during the modelling phase. This can happen, for example, while
defining a constraint and specifying too many tuples as allowed by it, or
while using variables and constraints that in reality do not add any information
to the rest of the problem. The first kind of redundancy, which may be called
constraint redundancy, can be dealt with by the local consistency algorithms
[Mon74, Mac77, MF85, DP88, Dec92], whose task is exactly that of identifying,
and removing, some tuples in some constraints which are not allowed by the rest
of the problem. The second type of redundancy can be called variable
redundancy, and it is the issue we consider in this paper. That is, we are
interested in understanding how we can recognize (at least some of) these
redundant variables (which we call existential variables), and how we can use
this information to solve CSPs in a more efficient way.

More precisely, a variable is existential if, once (some or all of) the other
variables have been instantiated, one can always find a value for it such that
all constraints are satisfied. This means that any solution of a subpart of the
CSP (the part without existential variables) can always be extended to a
solution of the entire CSP without too much effort. In fact, the extension only
needs the time to try, for each of the existential variables, at most all the
values in its domain. Thus it is linear in the number of such variables, with a
multiplicative factor which includes the size of their domain and their degree.

We consider a sufficient condition for a variable to be existential, and we
exploit it to develop more efficient local consistency algorithms. More precisely,
if a variable is connected to a number of constraints which is smaller than the
level of consistency of the problem, then it is existential. In fact, by definition of
k-consistency [Fre78, Fre88], once k-1 variables are instantiated in a compatible
way, any k-th variable can also be instantiated compatibly with them.

We also observe that the subCSP not containing the variable discovered as
existential, say P, is still strong k-consistent if the entire CSP was so. This makes
it possible to recognize as existential also variables that do not have a degree
smaller than the level of local consistency when looking at the entire problem, but
do have it when looking at the subCSP P. In this way, more and more variables
can be identified as existential, until all remaining variables have a degree greater
than or equal to the level of consistency.

Therefore, if a given CSP is k-consistent, then a certain set S of variables will
be found to be existential. This set can be obtained by considering all variables
of degree less than k, then those variables whose degree becomes less than k
after removing the first ones, and so on. Now, if we want to find a solution of the
entire CSP, it is enough to first find an instantiation of the variables in V - S
(if V is the set of all variables) which satisfies all constraints among them, and
then an instantiation of the existential variables which satisfies all constraints.
The first phase can be exponential in the cardinality of V - S, while the second
phase is linear in the cardinality of S.

However, if the CSP is not k-consistent, to get the same complexity we must
make it k-consistent, which in general takes a time polynomial in the number of
variables of the CSP (and exponential in k). Nevertheless, since not all existential
variables really need k-consistency to be recognized as existential, one could
achieve lower levels of consistency on some parts of the CSP while maintaining
the same search complexity. In this way, we develop an algorithm which does
not achieve k-consistency over the whole CSP, but only on those parts where
it is needed. The kind of local consistency achieved is called incremental
k-consistency. The resulting set of existential variables, however, is greater
than or equal to that found after a standard k-consistency algorithm.

We also consider weaker sufficient conditions for existentiality. However, they
cannot be used to develop more efficient algorithms to obtain the same set of
existential variables, mainly due to the fact that such weaker notions rely on
local properties instead of global ones.

We also consider (1,k)-consistency [Fre88], which is a form of local consistency
which is more powerful than arc-consistency [Mac77] while still removing just
domain elements (and thus never adding any new constraint), and we give some
ideas on how to develop an algorithm to achieve this form of consistency which
follows the lines of incremental k-consistency.

The paper is organized as follows. Section 2 relates our work to the results
already presented in the literature. Then, Section 3 provides the necessary
notions about CSPs, graphs, and local consistency that will be needed in the paper,
and Section 4 defines the concept of existential variables. Section 5 proposes the
algorithm which achieves incremental k-consistency and yields the same search
complexity as full k-consistency. Then, Section 6 discusses a similar algorithm
for (1,k)-consistency. Finally, Section 7 summarizes the results of the paper and
discusses topics for future work.


2  Related  Work 


The concept of existential variables is close to that of redundant hidden variables
[Ros95]. In CSPs with hidden variables, the solution concerns only a subset of
the variables, the visible ones. All other variables are hidden. Then, some of the
hidden variables can be found to be redundant, in the sense that their removal,
together with the removal of the constraints connecting them, does not change
the solution set of the CSP. Sufficient conditions similar to those considered in
this paper also hold for redundant hidden variables, and similar algorithms can
be used to find them. However, in CSPs with hidden variables, this concept has
not been used to make the preprocessing phase more efficient.

The concept of width, and the fact that a CSP whose level of consistency is
greater than its width has a backtrack-free search [Fre88], are also closely related
to our work. In fact, the subCSP spanned by the existential variables can be
proved to have a width smaller than the level of local consistency. However, our
analysis is at a more local level, since we consider some variables at a time, and
not the whole CSP.

The adaptive consistency algorithm [DP88] is based on the observation that,
to get a backtrack-free search over an ordering of the variables, one needs to
consider each variable, from the last one in the ordering to the first one, and
achieve directional k-consistency over the subCSP spanned by this variable and
its neighbors before it in the ordering, if there are k-1 of them. Our algorithm
instead does not obtain a backtrack-free search, but a backtrack-bounded one
[Fre88], where the bound is given by the number of non-existential variables.
Thus it can be seen as an adaptive consistency algorithm where, however, a limit
is put on the amount of preprocessing (no more than k-consistency).
If a tuple $(d_1, \ldots, d_n)$ is in the solution of a CSP whose variables are
$v_1, \ldots, v_n$, then sometimes we will write it as the set of equations
$\{v_1 = d_1, \ldots, v_n = d_n\}$.
The structure of a CSP can be easily pictured as a (hyper)graph, which is
usually called a constraint graph [DP88], where nodes represent variables and
hyperarcs represent constraints. The representation of a CSP by a hypergraph
is very convenient, because many notions typical of (hyper)graphs can be used
in the CSP context. For example, in this paper we will use the concept of degree
of a node (variable), which is the number of arcs (constraints) incident on that
node (variable), and of the CSP spanned by a certain set of variables.

Definition 3 (graph and related). A graph is a triple $G = (N, A, f)$, where
$N$ is the set of nodes, $A$ is the set of arcs, and the function $f$, defined
on $A$, specifies which nodes are connected to which arc. In a graph
$G = (N, A, f)$, consider a node $n \in N$. The degree of $n$, written
$degree(n)$, is defined as the cardinality of the set
$\{n' \in N \mid \exists a \in A \text{ with } (n, n') \in f(a)\}$ (footnote 2).
Also, given a subset of the nodes $N' \subseteq N$, we define the graph spanned
by $N'$ as $G(N') = (N', A', f')$, where
$A' = \{a \in A \mid f(a) \subseteq N'\}$ and $f' = f|_{A'}$. Given a CSP
$P = (V, D, C, con, def)$, its constraint graph is $G(P) = (V, C, con)$. □
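The degree and spanned-graph notions can be illustrated with a short sketch. It assumes that arcs are stored in a dictionary mapping each arc name to the tuple of nodes it connects (playing the role of f), and that a node is not counted as its own neighbour; both are choices made here, not fixed by the definition.

```python
def degree(node, arcs):
    """Number of distinct other nodes sharing at least one arc with `node`."""
    neighbours = {n for scope in arcs.values() if node in scope for n in scope}
    return len(neighbours - {node})


def spanned(nodes, arcs):
    """Arcs of the graph spanned by `nodes`: those lying entirely inside it."""
    keep = set(nodes)
    return {a: scope for a, scope in arcs.items() if set(scope) <= keep}


# Hypothetical hypergraph with three binary arcs and one ternary arc.
arcs = {"c1": ("x", "y"), "c2": ("y", "z"), "c3": ("x", "z"),
        "c4": ("x", "y", "w")}
print(degree("x", arcs), spanned({"x", "y"}, arcs))
```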


3.2  Local  Consistency 

Local consistency algorithms remove from a CSP some domain elements, or
some tuples from constraint definitions, if these objects are found to be
inconsistent with some other object in the CSP. This is safe (that is, it does
not change the set of solutions of the CSP), because local inconsistency implies
global inconsistency, and thus such objects would never appear in any solution
of the CSP. However, there may be objects (tuples and/or domain elements) which
are inconsistent but are not recognized as such and therefore are not removed.
Thus in general only local consistency is achieved (and not global consistency,
which would mean that the problem is solved).

The first local consistency notions were called arc-consistency [Mac77]
and path-consistency [Mon74]. Later, both of them were generalized to the
concept of k-consistency [Fre88]: a CSP is k-consistent if, for each set of k-1
variables, and values for them which are allowed by all the constraints involving
subsets of them, and for each other variable, there is at least one value locally
allowed for this k-th variable which is compatible with the values chosen for all
the other k-1 variables. In this line, arc-consistency [Mac77] is just
2-consistency and path-consistency [Mon74] is 3-consistency. Formally,
k-consistency can be defined as follows:

Definition 4 ((strong) k-consistency [Fre88]). A CSP $(V, D, C, con, def)$
is said to be k-consistent if, for all k-1 variables
$v_1, \ldots, v_{k-1} \in V$, values $d_1, \ldots, d_{k-1} \in D$, and k-th
variable $v_k \in V$, there is at least a value $d_k \in D$ such that,

Footnote 2: Here we mean that $n$ and $n'$ appear in a tuple which belongs to $f(a)$.

3 Background

Here we give the basic notions on finite domain constraint problems, local
consistency, graph structure, and search that will be useful in the rest of
the paper.


3.1  Finite  Domain  Constraint  Problems 

A (finite domain) constraint problem (CSP) consists of a set of variables
ranging over a finite domain, and a set of constraints among such variables. Each
constraint involves a subset of the variables and specifies a set of tuples of
values, for such variables, which satisfy the constraint. A solution of a CSP is an
instantiation of all the variables such that all the constraints are satisfied.

Definition 1 (constraint satisfaction problem). A (finite domain) constraint
satisfaction problem (CSP) is a tuple $(V, D, C, con, def)$ where

- $V$ is a finite set of variables (i.e., $V = \{v_1, \ldots, v_n\}$);

- $D$ is a finite set of values, called the domain;

- $C$ is a finite set of constraints (i.e., $C = \{c_1, \ldots, c_m\}$); $C$ is
  ranked, i.e. $C = \bigcup_k C_k$ such that $c \in C_k$ if $c$ involves $k$
  variables;

- $con$ is called the connection function and it is such that
  $con : \bigcup_k (C_k \to V^k)$, where $con(c) = (v_1, \ldots, v_k)$ is the
  tuple of variables involved in $c \in C_k$;

- $def$ is called the definition function and it is such that
  $def : \bigcup_k (C_k \to \wp(D^k))$.

Given a CSP $P = (V, D, C, con, def)$, consider a subset $V' \subseteq V$. Then
the subCSP spanned by $V'$ is $CSP(P, V') = (V', D, C', con', def')$, where $C'$
is the set of constraints involving only variables in $V'$ and $con'$ and $def'$
are the restrictions of $con$ and $def$ to $C'$. □

Function con describes which variables are involved in which constraint, while
function def gives the meaning of a constraint in terms of a set of tuples of
domain elements, which represent the allowed combinations of values for the
involved variables. Then, the solution $Sol(P)$ of a CSP $P = (V, D, C, con, def)$
is defined as the set of all instantiations of the variables in V (seen as tuples of
values) which are consistent with all the constraints in C.

Definition 2 (CSP solution). The solution $Sol(P)$ of a CSP
$P = (V, D, C, con, def)$ is defined as the set of all tuples
$(v_1, \ldots, v_n)$ (footnote 1) such that

- $v_i \in D$ for all $i$;

- $\forall c \in C$, $(v_1, \ldots, v_n)|_{con(c)} \in def(c)$. □
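Definitions 1 and 2 can be made concrete with a brute-force sketch. The encoding is an assumption made for illustration: `con` and `defn` are dictionaries keyed by constraint name (`defn` stands for def, which is a reserved word in Python), and the order of the `variables` list fixes the order of values in a solution tuple.

```python
from itertools import product


def solutions(variables, domain, con, defn):
    """Enumerate Sol(P): all tuples over `domain` satisfying every constraint.

    con[c]  -> tuple of variables involved in constraint c
    defn[c] -> set of allowed value tuples for constraint c
    """
    sols = []
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[v] for v in con[c]) in defn[c] for c in con):
            sols.append(values)
    return sols


def sub_csp(variables, con, defn, keep):
    """CSP(P, V'): keep only the constraints whose variables all lie in V'."""
    keep_set = set(keep)
    sub_vars = [v for v in variables if v in keep_set]
    sub_con = {c: s for c, s in con.items() if set(s) <= keep_set}
    sub_def = {c: defn[c] for c in sub_con}
    return sub_vars, sub_con, sub_def
```

The enumeration takes time exponential in the number of variables, so it is only useful for checking small examples by hand.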

Footnote 1: Here we assume that an order has been given to the variables in V.
if $v_i = d_i$ for all $i = 1, \ldots, k-1$ belongs to the solution of the CSP
$(VK1 = \{v_1, \ldots, v_{k-1}\}, D, C|_{VK1}, con|_{CK1}, def|_{CK1})$, then
$v_i = d_i$ for all $i = 1, \ldots, k$ belongs to the solution of the CSP
$(VK = \{v_1, \ldots, v_k\}, D, C|_{VK}, con|_{CK}, def|_{CK})$.
A CSP is said to be strong k-consistent whenever it is j-consistent for all
$j \le k$.
□
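For small problems, k-consistency can be tested directly from the definition. The sketch below is a naive check written for illustration, not one of the algorithms cited in the text; it assumes constraints are stored as dictionaries `con` (constraint name to variable tuple) and `defn` (constraint name to set of allowed tuples).

```python
from itertools import combinations, product


def consistent(assignment, con, defn):
    """True if every constraint whose variables are all assigned is satisfied."""
    return all(tuple(assignment[v] for v in scope) in defn[c]
               for c, scope in con.items()
               if all(v in assignment for v in scope))


def is_k_consistent(variables, domain, con, defn, k):
    for subset in combinations(variables, k - 1):
        for values in product(domain, repeat=k - 1):
            partial = dict(zip(subset, values))
            if not consistent(partial, con, defn):
                continue  # only compatible instantiations must be extendable
            for extra in variables:
                if extra in partial:
                    continue
                if not any(consistent({**partial, extra: d}, con, defn)
                           for d in domain):
                    return False
    return True


def is_strong_k_consistent(variables, domain, con, defn, k):
    return all(is_k_consistent(variables, domain, con, defn, j)
               for j in range(1, k + 1))
```

For instance, three mutually different variables over a two-value domain are 2-consistent but not 3-consistent, so the problem is strong 2-consistent only.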

There are many ways to achieve k-consistency (for example, many algorithms
have been proposed for achieving arc- and path-consistency). However, it has
been proved that in the worst case a k-consistency algorithm is $O(n^k)$ [Fre88],
where n is the number of variables of the given CSP. Thus, if k is much smaller
than n, such algorithms are polynomial. Also, it is important to recall that
achieving strong k-consistency may involve adding constraints with at most k-1
variables [Fre78].

Since CSPs are NP-hard problems, they are usually solved via a backtracking
search, where partial assignments are extended by one variable at a time while
checking the satisfiability of the subset of constraints connecting the already
assigned variables. Whenever the new variable cannot be assigned any value
compatible with the constraints, the assignment of the latest variable is backtracked
and another value is tried. This search is exponential in the worst case. It can
however be improved if some search branches are cut in advance, which is usually
done by using a local consistency algorithm, whose aim is to make constraints
stronger while not changing the set of solutions, or, in other words, to remove
redundant tuples from the constraint definitions. In fact, if a tuple is removed,
then some failure branches are cut from the search tree. In other words, the role
of the local consistency algorithms is to obtain a gain in the process of finding
a solution via a backtrack search (that is, to eliminate some of the thrashing
behaviour of such search [Mac77]).
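The backtracking search described above can be sketched in a few lines. This is a plain chronological backtracker over a fixed variable order, given only as an illustration; the dictionary encoding of `con` and `defn` is an assumption of this example.

```python
def consistent(assignment, con, defn):
    """Check the constraints connecting the already assigned variables."""
    return all(tuple(assignment[v] for v in scope) in defn[c]
               for c, scope in con.items()
               if all(v in assignment for v in scope))


def backtrack(variables, domain, con, defn, assignment=None):
    """Return one solution as a dict, or None if the CSP has no solution."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    var = variables[len(assignment)]  # next variable in the fixed order
    for value in domain:
        assignment[var] = value
        if consistent(assignment, con, defn):
            result = backtrack(variables, domain, con, defn, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try another value
    return None
```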

Sometimes, the pruning achieved by the local consistency algorithm is so
extensive that the subsequent search does not need any backtracking to find a
solution of the given problem. For example, it has been recognized that the
sparseness of the constraint graph and the level of consistency of the CSP have a
strong relationship with the fact that a search-based solution algorithm could
solve the CSP without backtracking at all or with a bounded amount of
backtracking [Fre88]. More precisely, if a CSP is strong k-consistent and has
"width" less than k, then it is backtrack-free. Here, the width is a notion
related to how sparse the CSP is: the sparser it is, the smaller its width is.

Definition 5 (width [Fre88]). Given a CSP and its constraint graph, consider
the ordered graph obtained by putting an order on the nodes of the constraint
graph. Then

- the width of a node in an ordering $O$ is the number of arcs that connect
  such node to nodes previous to it in the order $O$: if $O = v_1, \ldots, v_n$,
  then the width of $v_k$ in $O$ is the number of arcs $a$ such that
  $\{v_i, v_k\} \subseteq con(a)$ and $i \in \{1, \ldots, k-1\}$;

- the width of an ordering is the maximum of the widths of the nodes in that
  ordering;

- the width of a constraint graph is the minimum of the widths of all the
  orderings on that graph. □
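The three levels of the width definition translate directly into code. The sketch below tries all orderings, so it is exponential and meant only for small graphs; the representation of arcs as a list of variable tuples is an assumption of this example.

```python
from itertools import permutations


def ordering_width(order, scopes):
    """Width of an ordering: maximum node width over the ordering."""
    position = {v: i for i, v in enumerate(order)}
    widths = []
    for v in order:
        # arcs containing v together with at least one earlier node
        w = sum(1 for scope in scopes
                if v in scope
                and any(position[u] < position[v] for u in scope if u != v))
        widths.append(w)
    return max(widths, default=0)


def graph_width(variables, scopes):
    """Width of a constraint graph: minimum width over all orderings."""
    return min(ordering_width(order, scopes)
               for order in permutations(variables))


star = [("x", "y"), ("x", "z")]
triangle = [("x", "y"), ("y", "z"), ("x", "z")]
print(graph_width(["x", "y", "z"], star), graph_width(["x", "y", "z"], triangle))
```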

Theorem 6 (width, k-consistency, backtrack-free search [Fre88]). Given
a CSP, there exists a backtrack-free search if the level of strong consistency of
the CSP is greater than the width of its constraint graph. □

4  Existential  Variables 

In a given CSP, an existential variable is a variable whose removal, together with
that of all the constraints involving it, does not change the set of solutions of
the CSP spanned by all the other variables. More precisely:

Definition 7 (existential variables). Given a CSP $P = (V, D, C, con, def)$
and a variable $x \in V$, $x$ is an existential variable for $P$ if and only if
$Sol(P)|_{V - \{x\}} = Sol(CSP(P, V - \{x\}))$. □

Consider for example the CSP in Figure 1. This CSP has three variables,
x, y, and z, and three binary constraints. The removal of variable x, together
with all the constraints involving it, does not change the solution of the subCSP
spanned by y and z, which is {y = c, z = b}. Thus x is existential for this CSP
(footnote 3).


Fig.  1.  Existential  variables. 
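Definition 7 can be checked by brute force on a small instance. The constraint tuples of Figure 1 are not recoverable from the text, so the CSP below is a hypothetical one built to have the same shape (three variables, three binary constraints, and {y = c, z = b} as the only solution of the subCSP on y and z); the dictionary encoding is also an assumption.

```python
from itertools import product


def solutions(variables, domain, con, defn):
    """Sol(P) as a set of value tuples, ordered as in `variables`."""
    out = set()
    for values in product(domain, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[v] for v in con[c]) in defn[c] for c in con):
            out.add(values)
    return out


def is_existential(x, variables, domain, con, defn):
    """Test whether Sol(P) projected on V - {x} equals Sol(CSP(P, V - {x}))."""
    rest = [v for v in variables if v != x]
    sub_con = {c: s for c, s in con.items() if x not in s}
    sub_def = {c: defn[c] for c in sub_con}
    idx = [i for i, v in enumerate(variables) if v != x]
    projected = {tuple(sol[i] for i in idx)
                 for sol in solutions(variables, domain, con, defn)}
    return projected == solutions(rest, domain, sub_con, sub_def)


# Hypothetical instance (not the actual content of Figure 1).
variables = ["x", "y", "z"]
domain = ["a", "b", "c"]
con = {"xy": ("x", "y"), "xz": ("x", "z"), "yz": ("y", "z")}
defn = {"xy": {("a", "c"), ("b", "c")},
        "xz": {("a", "b"), ("b", "b")},
        "yz": {("c", "b")}}
print([v for v in variables if is_existential(v, variables, domain, con, defn)])
```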

Existential  variables  are  very  important  in  CSPs,  since  they  allow  us  to  reduce 
the  task  of  solving  a  given  CSP  to  the  task  of  solving  an  equivalent  but  smaller 
CSP.  In  fact,  the  following  theorem  can  be  proved. 

Theorem 8. Consider a CSP $P = (V, D, C, con, def)$, an existential variable
$x$ for $P$, and the CSP $P' = CSP(P, V - \{x\})$. Then, taken any tuple in the
solution of $P'$, there is a way to extend it to a tuple in the solution of $P$.

Proof: Were it not possible to extend the tuple, it would contradict the
assumption that $x$ is an existential variable (which means that
$Sol(P)|_{V - \{x\}} = Sol(P')$).

Footnote 3: The same also holds for y and z, as we will see later.
The above theorem can be extended to the case of more than one existential
variable. That is, if we have a set S of variables each found individually to be
existential, then the solution of the entire CSP can be found by just solving the
CSP not containing any of the variables in S.

Theorem 9. Consider a CSP $P = (V, D, C, con, def)$, a set $S$ of existential
variables for $P$, and the CSP $P' = CSP(P, V - S)$. Then, taken any tuple in
the solution of $P'$, there is a way to extend it to a tuple in the solution of $P$.

Proof: If one can extend a longer tuple to an additional variable, then a fortiori it
is possible to extend a shorter tuple to the same variable, since a smaller number
of constraints have to be satisfied. □

In other words, if CSPs are solved via a backtracking search, existential
variables can be left to the end of the instantiation process: when their turn
comes, it will always be possible to find an instantiation for them that is
compatible with the values already assigned. Note that postponing a variable
means in practice that we do not have to worry about that variable any more.
Thus we can concentrate on the remaining problem, that is, the CSP spanned by
$V - S$, where S is the set of existential variables. Once a solution of this
CSP has been obtained in some way, a solution of the given problem can be found
with little additional effort (linear in the number of existential variables).
This means that we have reduced the problem of solving the given CSP to the
problem of solving an equivalent CSP with a smaller search space.
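
The linear-time completion step can be sketched as follows. This is a minimal
illustration, not the authors' implementation: the names
`extend_without_backtracking` and `not_equal_constraints`, the dictionary-based
assignment, and the `consistent` callback are all assumptions introduced here.
Each postponed variable scans its domain once, and no earlier choice is undone.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple

Assignment = Dict[Hashable, int]


def extend_without_backtracking(
    partial: Assignment,
    postponed: List[Hashable],
    domains: Dict[Hashable, Iterable[int]],
    consistent: Callable[[Assignment], bool],
) -> Optional[Assignment]:
    """Assign each postponed variable in turn, never revisiting a choice."""
    result = dict(partial)
    for var in postponed:
        for value in domains[var]:
            result[var] = value
            if consistent(result):
                break  # keep this value and move to the next variable
        else:
            return None  # can only happen if var was not existential
    return result


def not_equal_constraints(pairs: List[Tuple[Hashable, Hashable]]):
    """Hypothetical helper: binary 'different values' constraints."""
    def check(assignment: Assignment) -> bool:
        return all(
            assignment[a] != assignment[b]
            for a, b in pairs
            if a in assignment and b in assignment
        )
    return check


# Solution of the reduced CSP over {x, y}, then completion with postponed w.
check = not_equal_constraints([("x", "y"), ("w", "x")])
completed = extend_without_backtracking(
    {"x": 0, "y": 1}, ["w"], {"w": [0, 1]}, check
)
print(completed)
```

The cost is one pass over the domain per postponed variable, which is the
linear behaviour described above.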

Consider now a CSP which is strong k-consistent, and consider any variable,
say x, with degree smaller than k. Then it is easy to see that, if all variables
except x have been instantiated, there exists an instantiation of x such that
all constraints are satisfied. Actually, since the CSP is k-consistent,
instantiating only the variables connected to x (rather than all variables but
x), of which there are fewer than k by assumption, is enough to ensure that x
has a possible instantiation. Thus x is an existential variable. Moreover,
consider the CSP spanned by all variables but x. This CSP is still strong
k-consistent, so again all its variables of degree smaller than k are
existential variables, and so on until all variables have degree greater than or
equal to k. Thus the discovery of an existential variable may lead to that of
other existential variables which did not appear to be so at the beginning.

Theorem 10 (strong k-consistency and existential variables). Consider a
strong k-consistent CSP $P = (V, D, C, con, def)$, and any of its variables, say
$v \in V$, such that $degree(v) \le k - 1$. Then v is existential for P.

Proof: To show that v is existential for P we have to show that the solution
of the CSP is not changed by the removal of v. Therefore we consider $Sol(P')$,
where $P' = (V' = V - \{v\}, D, C' = C - \{c \mid v \in con(c)\}, con|_{C'}, def|_{C'})$.
Obviously $Sol(P) \subseteq Sol(P')$, since the removal of some constraints can
only enlarge the set of allowed tuples for the remaining variables. Consider now
any tuple $t' \in Sol(P')$. This tuple involves all variables in $V - \{v\}$.
Let us now see whether t' can be extended to v while satisfying all the
constraints in P. This is indeed so, since P is strong k-consistent and
$degree(v) \le k - 1$. In fact, consider the projection of t', say $t_p$, onto
the variables connected to v, say $u_1, \ldots, u_i$, with $i \le k - 1$. Then,
by strong k-consistency, $t_p$ can be extended to v while satisfying all
constraints involving $v, u_1, \ldots, u_i$, yielding tuple $t_v$. Therefore t'
together with the value given to v in $t_v$ is an extension of t' which
satisfies all constraints in P: that is, it is in the solution of P. Thus
$Sol(P') \subseteq Sol(P)$. As a result, $Sol(P) = Sol(P')$. Therefore v is
existential for P. □

A nice application of Theorem 10 is the problem of coloring a graph with
k colors. In fact, it can be proved that such a problem is k-consistent [vBD94],
and thus any variable with degree $k - 1$ or less is existential.
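
The counting argument behind this remark can be illustrated with a small sketch
(the helper name `free_colour` is hypothetical): a vertex with at most k - 1
neighbours always finds one of the k colours unused, whatever colours its
neighbours received.

```python
from typing import Iterable


def free_colour(neighbour_colours: Iterable[int], k: int) -> int:
    """Return the smallest colour in 0..k-1 not used by any neighbour.

    Assumes at most k - 1 neighbours, so at least one colour is always free.
    """
    used = set(neighbour_colours)
    return next(c for c in range(k) if c not in used)


# A vertex of degree 2 in a 3-colouring problem: one colour always remains.
print(free_colour([0, 2], k=3))
```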

Theorem 11 (existential variables and strong k-consistency). Consider a
strong k-consistent CSP $P = (V, D, C, con, def)$, and any of its variables, say
$v \in V$, such that $degree(v) \le k - 1$. Consider also the problem
$P' = CSP(P, V - \{v\})$. Then P' is strong k-consistent as well.

Proof: Since $V' \subseteq V$, all variables of P' are also in P, and all
constraints of P' are also in P. Therefore all the properties of subsets of such
variables and constraints which hold in P will a fortiori hold also in P'. □

An algorithm that postpones variables according to the results of Theorems 10
and 11, called Algorithm 1, is shown in Table 1. This algorithm takes a strong
k-consistent CSP and returns a partial ordering O of postponed variables. The
ordering is partial because no ordering is given among the variables postponed
at the same stage; thus it is represented as an ordered list of sets. If the
returned ordering is $O = S_l, S_{l-1}, \ldots, S_1$, then $S_1$ is the first
set of existential variables discovered (and its variables are thus the last
ones in the ordering).


Algorithm 1:

Input: a strong k-consistent CSP P = (V, D, C, con, def).

Output: an ordering O.

O := ∅;

1. V' := {v ∈ V such that degree(v) < k}, P' := CSP(P, V − V');

2. If P ≠ P' then P := P', O := V'.O, goto 1


Table 1. Algorithm 1


[Figure 2: constraint graph; the constraint labels list allowed tuples such as <1,0> and <0,1>.]

Fig. 2. Existentiality and variable degree.


As an example of the application of algorithm 1, consider the CSP in Figure
2. Assuming that the domain of each variable contains (or is represented by a
unary constraint containing) the values 0 and 1, this CSP is 2-consistent but not
3-consistent, since there is no way to extend any instantiation of any two
variables among x, y and z to the third variable. For this CSP, the algorithm
postpones only variable w, which has degree 1, since no other variable gets a
degree smaller than 2 after w is removed. Consider now the CSP in Figure 1,
which is 3-consistent. Here all variables have degree less than 3, thus the
algorithm would postpone all of them, meaning that the whole CSP has a
backtrack-free search. Finally, consider the CSP in Figure 3 (here unary
constraints are denoted by arrows pointing to the involved variable). This
problem is not 3-consistent, but it is 2-consistent. In the first iteration the
algorithm postpones only variable w, since this is the only variable with degree
1. Then it also removes v, since v gets degree 1 after the first pass of the
algorithm.
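
A minimal Python sketch of the degree-based postponement of Algorithm 1, under
stated assumptions: the CSP is represented only by its binary constraint graph
(a dictionary from variable to neighbour set), the degree of a variable is its
number of neighbours, and the input is taken to be strong k-consistent already
(the code does not check this). The function name `postpone_by_degree` and the
example graph, a triangle x, y, z with a pendant variable w attached to x in the
spirit of Figure 2, are illustrative choices rather than details from the paper.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def postpone_by_degree(graph: Graph, k: int) -> List[Set[Hashable]]:
    """Return the postponed sets, with the most recently found set first."""
    remaining = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    ordering: List[Set[Hashable]] = []
    while True:
        # Step 1: variables whose current degree is smaller than k.
        low = {v for v, nbrs in remaining.items() if len(nbrs) < k}
        if not low:
            return ordering  # nothing changed, so the iteration stops
        # Step 2: restrict the graph to the other variables and record V'.
        for v in low:
            del remaining[v]
        for nbrs in remaining.values():
            nbrs -= low
        ordering.insert(0, low)  # O := V'.O


# Assumed shape: triangle x, y, z plus w connected to x only.
example = {
    "x": {"y", "z", "w"},
    "y": {"x", "z"},
    "z": {"x", "y"},
    "w": {"x"},
}
print(postpone_by_degree(example, 2))
print(postpone_by_degree(example, 3))
```

With k = 2 only w is postponed, as in the discussion of Figure 2; with k = 3
the sketch postpones every variable of this assumed graph in two stages.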

[Figure 3: constraint graph; unary constraints <1> and <0> are attached to the variables.]

Fig. 3. A CSP with two existential variables: w and v.


Now, it is possible to prove that the CSP spanned by the variables we postpone
according to algorithm 1 in a strong k-consistent CSP has width less than k.
Therefore, our result could be seen as an application of the result in [Fre88]
to (dynamically chosen) subCSPs instead of entire CSPs.

Theorem 12 (redundancy and width). Given a strong k-consistent CSP P,
consider the CSP P' which is spanned by the variables of P postponed by
algorithm 1. Then, P' has width less than k.

Proof: To show that P' has width less than k it is enough to find an ordering
with width less than k, that is, where all the nodes are connected to at most
$k - 1$ previous nodes. Consider then the ordering O which is returned by
algorithm 1. In O, each node is connected to at most $k - 1$ nodes which are
later in the ordering, otherwise it could not be postponed by the algorithm.
Therefore, the reverse of ordering O has the desired feature. □
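
The notion of width used in the proof can be made concrete with a short sketch.
Everything here is illustrative: `ordering_width` is a hypothetical helper, and
the example is a path graph a - b - c with k = 2, where a and c would be
postponed first and b last, so that a search order places b first and the
variables postponed first come last.

```python
from typing import Dict, Hashable, List, Set


def ordering_width(graph: Dict[Hashable, Set[Hashable]],
                   order: List[Hashable]) -> int:
    """Width of an ordering: the largest number of earlier neighbours of a node."""
    position = {v: i for i, v in enumerate(order)}
    return max(
        (sum(1 for u in graph[v] if u in position and position[u] < position[v])
         for v in order),
        default=0,
    )


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
width_search_order = ordering_width(path, ["b", "a", "c"])
width_opposite = ordering_width(path, ["a", "c", "b"])
print(width_search_order, width_opposite)
```

For k = 2 the search order has width smaller than k, while the opposite order
does not, which is why the direction of the ordering matters in the proof.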

5  K-consistency  and  Existential  Variables 

From the previous section we know that, if a CSP is k-consistent, then algorithm
1 may find some existential variables, by iterating the process of considering
those variables with degree smaller than k. Then, the problem of solving the
entire CSP will be reduced to the problem of solving the CSP spanned by the
remaining variables (those not recognized as existential). Thus a backtrack
search will backtrack only over these variables. That is, if V is the set of all
variables, V' the set of variables found to be existential, and D the variable
domain, then the worst-case time complexity of such search will be
$O(|D|^{|V - V'|} + |D| \times |V'|)$. In fact, once an instantiation for the
variables in $V - V'$ has been found (and this may take exponential time), we
are sure that the variables in V' can be compatibly instantiated without
backtracking (and thus in linear time).
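
As a rough numerical illustration of this bound (exponential only in the number
of non-existential variables, linear in the number of existential ones), the
following sketch evaluates it for a small case. The function name
`search_bound` is hypothetical and constant factors are ignored.

```python
def search_bound(domain_size: int, n_vars: int, n_existential: int) -> int:
    """Exponential backtracking part plus linear completion part."""
    backtracking_part = domain_size ** (n_vars - n_existential)
    completion_part = domain_size * n_existential
    return backtracking_part + completion_part


# Ten Boolean variables: no existential variables versus four of them.
print(search_bound(2, 10, 0), search_bound(2, 10, 4))
```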

However, if the CSP is not already k-consistent, then k-consistency has to
be achieved, with a complexity which is in general $O(|V|^k)$. Here we propose a
more convenient way to achieve the same search complexity (that is, exponential
in $|V - V'|$ and linear in $|V'|$), or a smaller one.

The convenience lies in the fact that we do not achieve (strong) k-consistency
on the whole problem, but just on a subpart of it. This is made possible by
exploiting the concept of existential variables. In fact, not all variables
recognized as existential by algorithm 1 need k-consistency to be so. Consider
for example any variable with degree much smaller than k, say i. Such a variable
would be existential also in an $(i + 1)$-consistent CSP. Thus obtaining
k-consistency is too much for variables like this one: it would be enough to
achieve $(i + 1)$-consistency.

This observation immediately leads to the algorithm in Table 2, which basically
achieves only the necessary level of consistency in the parts of the CSP where
this is needed. Thus, at the end, different parts of the CSP will have
different levels of consistency, depending on the structure of the graph. More
precisely, the sparser a subCSP is, the lower its resulting level of
consistency.

Algorithm 2:

Input: a CSP P = (V, D, C, con, def).

Output: a partial ordering O.

O := ∅;

for j = 1 to k do

1. achieve j-consistency on CSP(P, V);

2. V' := {v ∈ V such that degree(v) < j};

3. if V' ≠ ∅ then V := V − V', O := O ∪ V', goto 2

endfor


Table 2. Algorithm 2
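
The control flow of Algorithm 2 can be sketched in Python as follows. This is
only an outline under explicit assumptions: the CSP is again reduced to its
binary constraint graph, and the consistency procedure is passed in as a
callback `achieve_consistency(graph, j)` that may add edges. The stub
`no_new_constraints` used in the example enforces nothing at all; it is a
placeholder so that the loop can be run, not a j-consistency algorithm. All
names are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def incremental_postpone(
    graph: Graph,
    k: int,
    achieve_consistency: Callable[[Graph, int], Graph],
) -> List[Set[Hashable]]:
    """Return the postponed sets in the order in which they are discovered."""
    remaining = {v: set(nbrs) for v, nbrs in graph.items()}
    postponed: List[Set[Hashable]] = []
    for j in range(1, k + 1):
        # Step 1: enforce j-consistency on what is left (may add edges).
        remaining = achieve_consistency(remaining, j)
        while True:
            # Step 2: variables whose current degree is below j.
            low = {v for v, nbrs in remaining.items() if len(nbrs) < j}
            if not low:
                break
            # Step 3: remove them and look again at the same level j.
            for v in low:
                del remaining[v]
            for nbrs in remaining.values():
                nbrs -= low
            postponed.append(low)
    return postponed


def no_new_constraints(graph: Graph, level: int) -> Graph:
    """Placeholder: a real j-consistency procedure would go here."""
    return graph


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
stages = incremental_postpone(path, 2, no_new_constraints)
print(stages)
```

Passing the consistency procedure as a parameter keeps the sketch independent
of any particular j-consistency algorithm, which the text leaves open.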


In reality, the hypothesis of Theorem 10 could be weakened. In fact, consider
a variable x which has degree i, and consider the i other variables it is
connected to, say $V_x = \{x_1, \ldots, x_i\}$. Then, achieve
$(i + 1)$-consistency on the sub-CSP spanned by $V_x \cup \{x\}$. At this point
it is possible to see that the removal of x does not change the solution of the
remaining problem. That is, x is existential.

Theorem 13. Consider a CSP $P = (V, D, C, con, def)$, and any of its variables,
say $x \in V$. Consider also the set of variables connected to x via some
constraints, say $V_x$, and the problem $P' = CSP(P, V_x \cup \{x\})$. If P' is
k-consistent and the degree of x is smaller than k, then x is existential for P.

Proof: By definition of k-consistency, once the variables in $V_x$ have an
instantiation, it is possible to find also a compatible instantiation for x. □

An even weaker condition can be obtained by considering that existential
variables are put at the end of the search ordering. Thus directional
k-consistency [DP88] is enough to make a variable with degree less than k
existential. In fact, it is enough to ensure that, once the variables in $V_x$
have been instantiated, x can be instantiated as well. This is the same
observation that led to the definition of the concept of adaptive consistency
[DP88].

However, neither of these weaker sufficient conditions leads to any improvement
in algorithm 2, since this algorithm looks for existential variables only after
achieving j-consistency. Thus we cannot identify in advance the subparts on
which to achieve j-consistency, because we do not know where we will find the
existential variables. This is due to the fact that achieving j-consistency, as
noted above, may add new constraints to the problem, thus modifying the degree
of some variable (and thus, possibly, its existentiality).

Algorithm 2 does not make the CSP k-consistent, of course, because
k-consistency is achieved only on the part of the CSP where all variables have
degree k or larger. However, the complexity of the search process is the same
(or less) as if we achieved k-consistency on the whole problem. To prove that
the complexity of the search remains the same or decreases, we just need to show
that the variables recognized as existential by the two methods (k-consistency
+ algorithm 1, or algorithm 2) are the same, or that those postponed by the
second method are a superset of those postponed by the first one.

Theorem 14. Consider a CSP P, and the CSP $P_1$ obtained by applying a
k-consistency algorithm to P. Consider also the order $O_1$ returned by
algorithm 1 on $P_1$. Then, consider the partial order $O_2$ returned by
applying algorithm 2 to P. Let $O_1 = S_l S_{l-1} \ldots S_1$ and
$O_2 = S'_{l'} \ldots S'_1$. Then we have that
$\bigcup_{i=1}^{l} S_i \subseteq \bigcup_{i=1}^{l'} S'_i$.

Proof: Assume a variable is postponed by algorithm 1 (applied after a
k-consistency algorithm). It means that such variable has degree smaller than k,
say j. Consider now the same variable during algorithm 2. Since obtaining any
level of consistency lower than k cannot add more constraints than obtaining
k-consistency, the degree of such variable would be smaller than or equal to j.
Thus, at iteration j of algorithm 2 at the latest, such variable will be
postponed.

Consider now a variable which has degree j in P, with $j < k$. Then it is
postponed at some iteration of algorithm 2. Consider now the same variable
after applying a k-consistency algorithm to P. Since achieving k-consistency
may add constraints of arity $k - 1$, this variable may gain $k - 1$ new
neighbors, thus getting a degree $j + k - 1$, which is greater than k if j is
greater than 1. Thus such variable is not postponed by algorithm 1. □

The complexity of algorithm 2 depends on how many variables are discovered
as existential at each of the k iterations. If the set of existential variables
discovered at iteration j is $V_j$ (thus we have $V_1 + \ldots + V_k = V'$),
then the complexity is
$O(|V| + |V - V_1|^2 + |V - V_1 - V_2|^3 + \ldots + |V - V'|^k)$, instead of
$O(|V|^k)$, which is the complexity of a k-consistency algorithm.

A special case of the use of algorithm 2 is when one knows that a CSP is
polynomially solved by a k-consistency algorithm. This is for example the case
of simple temporal constraint problems (STCSPs) [DMP89], which are solved by
3-consistency. This means that, after one has achieved 3-consistency, the
variables can be instantiated compatibly with the constraints without any
backtracking. In this case, applying algorithm 2 would obtain the same resulting
backtrack-free search, although with a restriction on the order of the
instantiations.

6  (1,k)-consistency  and  Existential  Variables

The concept of (i,j)-consistency [Fre88] is a generalization of that of
k-consistency: a CSP is (i,j)-consistent if, given an instantiation of i
variables which is compatible with all the constraints among the i variables, it
is possible to extend it to j other variables such that all constraints among
the $i + j$ variables are satisfied. Now, as achieving k-consistency may add
constraints of arity at most $k - 1$, achieving (i,j)-consistency may add
constraints of arity at most i. Adding new constraints may be too much of a
burden on the space and time requirements of the local consistency algorithm,
since it means creating new data structures. This is one of the reasons for the
great success of arc-consistency (that is, 2-consistency). In fact, achieving
arc-consistency may involve adding constraints of arity at most 1, which means
just removing elements from variable domains. In this way, no new data structure
is needed, but just a modification of an already existing data structure, the
domain of a variable (which in most cases is just a bit vector).
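
To illustrate why arc-consistency needs no new data structures, here is a
minimal sketch of an AC-3-style propagation loop over bit-vector domains. AC-3
is a standard algorithm and is used here only as an illustration; the text does
not prescribe a particular arc-consistency procedure. The representation (an
integer bitmask per variable, a set of allowed value pairs per directed arc) and
the name `ac3` are assumptions.

```python
from collections import deque
from typing import Dict, Hashable, Set, Tuple

Arc = Tuple[Hashable, Hashable]


def ac3(domains: Dict[Hashable, int],
        allowed: Dict[Arc, Set[Tuple[int, int]]]) -> Dict[Hashable, int]:
    """Prune bitmask domains in place until every arc is consistent.

    Bit a of domains[x] is set when value a is still in the domain of x.
    allowed[(x, y)] holds the pairs (a, b) permitted for x = a, y = b;
    both directions of each constraint must be present.
    """
    queue = deque(allowed)
    while queue:
        x, y = queue.popleft()
        revised = False
        for a in range(domains[x].bit_length()):
            if not (domains[x] >> a) & 1:
                continue  # value a was already removed
            supported = any(
                (domains[y] >> b) & 1 and (a, b) in allowed[(x, y)]
                for b in range(domains[y].bit_length())
            )
            if not supported:
                domains[x] &= ~(1 << a)  # removing a value clears one bit
                revised = True
        if revised:
            # Arcs pointing at x must be rechecked, except the one from y.
            queue.extend(arc for arc in allowed if arc[1] == x and arc[0] != y)
    return domains


# x < y over the values {0, 1, 2}
less = {(a, b) for a in range(3) for b in range(3) if a < b}
doms = ac3(
    {"x": 0b111, "y": 0b111},
    {("x", "y"): less, ("y", "x"): {(b, a) for a, b in less}},
)
print({v: bin(mask) for v, mask in doms.items()})
```

The only state that changes is the integer stored for each variable, which is
the point made in the paragraph above.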

Therefore, it is important to study properties and behaviour of algorithms
which just remove domain values. Now, it is obvious that the class of
(1,k)-consistency algorithms fits in this category. Notice that
(1,1)-consistency is just 2-consistency (thus arc-consistency), while
(1,k)-consistency, with k greater than 1, is obviously more powerful than
2-consistency. Thus we have an algorithm which achieves more pruning than
arc-consistency while still removing just domain elements.
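
The definition of (i,j)-consistency can be checked directly on tiny problems.
The following brute-force sketch is exponential and meant only to make the
definition concrete; the function name `is_ij_consistent` and the encoding of a
constraint as a pair (scope, predicate) are assumptions. The example uses three
variables over {0, 1} that must be pairwise different.

```python
from itertools import combinations, product
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Constraint = Tuple[Tuple[Hashable, ...], Callable[..., bool]]


def is_ij_consistent(
    variables: Sequence[Hashable],
    domains: Dict[Hashable, Sequence[int]],
    constraints: List[Constraint],
    i: int,
    j: int,
) -> bool:
    """Check by enumeration that every consistent i-tuple extends to any j others."""

    def satisfied(assign: Dict[Hashable, int]) -> bool:
        # Only constraints whose whole scope is assigned are checked.
        return all(
            pred(*(assign[v] for v in scope))
            for scope, pred in constraints
            if all(v in assign for v in scope)
        )

    for base in combinations(variables, i):
        rest = [v for v in variables if v not in base]
        for values in product(*(domains[v] for v in base)):
            assign = dict(zip(base, values))
            if not satisfied(assign):
                continue
            for extra in combinations(rest, j):
                if not any(
                    satisfied({**assign, **dict(zip(extra, more))})
                    for more in product(*(domains[v] for v in extra))
                ):
                    return False
    return True


names = ["x", "y", "z"]
doms01 = {v: [0, 1] for v in names}
differ = [((a, b), lambda p, q: p != q) for a, b in combinations(names, 2)]
print(is_ij_consistent(names, doms01, differ, 1, 1))
print(is_ij_consistent(names, doms01, differ, 1, 2))
```

On this example the problem is (1,1)-consistent, that is arc-consistent, but not
(1,2)-consistent, so a (1,2)-consistency algorithm would prune values that
arc-consistency leaves in place.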

As for k-consistency, achieving (i,j)-consistency may yield a great gain during
a subsequent search process to find a solution of the given CSP. In particular,
there is a relationship between the amount of backtracking to be done during
the search and the level of (1,k)-consistency that the problem has (see [Fre88]
for more details). Therefore, it is natural to try to exploit the concept of
existential variables also with respect to this kind of algorithm, so that some
variables may be postponed during the search. In particular, it is possible to
find sufficient conditions similar to those in Section 4. One of them is the
following.

Theorem 15 ((1,k)-consistency and existential variables). Consider a CSP
$P = (V, D, C, con, def)$ which is (1,k)-consistent, and any set
$S = \{x_1, \ldots, x_i\} \subseteq V$ such that $i < k$. Assume also that there
is a variable $x \notin S$ with $V_{x_j} \subseteq S \cup \{x\}$ for all
$j = 1, \ldots, i$, where $V_{x_j}$ is the set of neighbors of $x_j$. Then all
variables in S are existential for P.

Proof:  Similar  to  that  of  Theorem  10,  and  left  out  for  reasons  of  space.  □ 

The above theorem basically says that any set containing fewer than k variables
which are connected either among themselves or to another variable x is a set
of variables which can be instantiated without any backtracking after all other
variables have been instantiated. Actually, only the instantiation of variable x
is needed.

Starting from this theorem, an algorithm similar to algorithm 2 may easily
be derived, so that a search complexity smaller than or equal to that needed
for a (1,k)-consistent CSP may be obtained without actually computing
(1,k)-consistency everywhere in the CSP.

7  Conclusions  and  Future  work 

We proposed an algorithm which achieves different levels of consistency, all
lower than or equal to k, on different parts of a CSP, but which yields a
subsequent search process with a complexity smaller than or equal to that one
would have in a k-consistent CSP. That is, we got a smaller complexity of the
solution search by using a less costly algorithm for local consistency.

We plan to experiment with our algorithm for achieving incremental
k-consistency and see how it behaves with respect to standard k-consistency
algorithms (as a start, we will consider $k = 3$, so that we can compare it to
the many existing algorithms for path-consistency).

We also plan to investigate the relationship between constraint tightness and
variable existentiality, following some recent studies on the relationship
between such notion and backtrack-free search [vBD94]. In fact, our conjecture
is that, in a CSP which is $(m + 2)$-consistent, any variable which is involved
only in constraints with tightness smaller than or equal to m is existential.
This could be combined with the sufficient condition we consider in this paper
to discover more existential variables during the algorithm.

We also plan to combine this work with that on CSPs with hidden variables,
so that in a CSP with both visible and hidden variables, some hidden variables
are removed because they are found to be redundant [Ros95], and some visible
variables are postponed because they are found to be existential.


Abstract. This paper investigates logical characterizations of some aspects
of concurrent constraint (cc) computations. It contains both negative and
positive results.

We show that intuitionistic logic makes it possible to observe the so-called
stores of a concurrent constraint agent, but neither its successes nor its
suspensions, even in the monotonic and deterministic case. On the other hand,
IMALL (intuitionistic multiplicative and additive linear logic) does enable the
observation of successes (but not that of suspensions): we consider a
non-monotonic and non-deterministic version of cc, lcc, and we show that the
successes of an lcc computation can be characterized logically; this also holds
for cc, since cc can be faithfully translated into lcc.


Keywords 

Concurrent  constraint  programming,  intuitionistic  logic,  linear  logic. 


1  Introduction 

Concurrent constraint programming (cc) [21] is a model of concurrent
computation, where concurrent agents communicate through a shared store,
represented by a constraint, which expresses some partial information on the
values of the variables involved in the computation. An agent may add a
constraint c to the store, or ask the store to entail a given constraint
($c \to A$). Communication is asynchronous: agents can remain idle, and senders
(constraints c) are not blocking. Computation is monotonic (the constraints in
the store are not consumed): this makes it possible to provide cc with a
denotational semantics, viewing agents as closure operators on the semi-lattice
of constraints [23].
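
The tell/ask interaction can be sketched with a toy store. This is only an
illustration under strong simplifying assumptions: a constraint is modelled as a
set of atomic facts, entailment as set inclusion, and an agent as a Python
callback. The class `Store` and its method names are hypothetical and do not
correspond to any cc implementation.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple


class Store:
    """Monotonic store: facts are only ever added, never removed."""

    def __init__(self) -> None:
        self.facts: Set[str] = set()
        # Suspended ask agents: (guard, continuation) pairs.
        self.suspended: List[Tuple[FrozenSet[str], Callable[["Store"], object]]] = []

    def tell(self, constraint: Iterable[str]) -> None:
        """Add a constraint to the store, then wake any enabled ask agents."""
        self.facts |= set(constraint)
        self._wake()

    def ask(self, guard: Iterable[str],
            continuation: Callable[["Store"], object]) -> None:
        """Register an agent that waits until the store entails the guard."""
        self.suspended.append((frozenset(guard), continuation))
        self._wake()

    def _wake(self) -> None:
        progress = True
        while progress:
            progress = False
            for entry in list(self.suspended):
                guard, continuation = entry
                # The entry may already have been fired by a nested tell.
                if guard <= self.facts and entry in self.suspended:
                    self.suspended.remove(entry)
                    continuation(self)
                    progress = True


store = Store()
log: List[str] = []
store.ask({"x>0"}, lambda s: (log.append("fired"), s.tell({"y>0"})))
suspended_before = len(store.suspended)  # the guard is not entailed yet
store.tell({"x>0"})                      # now the ask agent can proceed
print(log, sorted(store.facts))
```

The ask agent stays suspended until a tell makes its guard entailed, and the
store only grows, which is the monotonicity mentioned above.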

Syntactically, concurrent constraint programming is an extension of constraint
logic programming [9, 14] with a suspension mechanism $c \to A$, and the
operational semantics of cc is the same as that of constraint logic programming,
except for $c \to A$, which blocks until the amount of accumulated information
(the store) is strong enough to entail c (in intuitionistic or classical logic),
in which case $c \to A$ evolves to A.

Besides, Saraswat and Lincoln have proposed a non-monotonic version of cc
[22], further studied in [4, 25], where the logic of constraints is linear logic
[8]: in this version, constraints can be consumed, and the language is therefore
closer to process calculi like Milner’s π-calculus [17].

In constraint logic programming (with or without negation by failure, or
constructive negation), the logical nature of the constraint system extends to
the goals and program declarations, and establishes strong connections between
operational semantics and entailment (in classical logic or 3-valued logic)
[14, 12, 24, 6]. For instance, success constraints (i.e. final states of
computations) can be observed logically: any success entails the initial state
(modulo the completed program $P^*$ and the constraint system $\mathcal{C}$);
conversely any constraint c entailing a goal G is covered (again modulo $P^*$
and $\mathcal{C}$) by a finite set of successes $c_1 \ldots c_n$, i.e.
$\mathcal{C} \vdash \forall (c_1 \vee \ldots \vee c_n \Leftarrow c)$. Such
results make it easier to design and understand programs, and provide useful
tools for reasoning about them.

In concurrent constraint programming, the situation is less clear. In [13]
Lincoln and Saraswat give an interesting connection between the observation of
the stores of cc agents and entailment in intuitionistic logic. However, it
tells just part of the story of a cc computation: for instance, it does not say
anything about agents that may end up suspended. Actually, let a success of an
agent A be a store c such that A evolves to c, and let a suspension be an agent
$B = c \wedge (d \to A)$ such that A evolves to B and c does not entail d (the
exact definition is slightly longer): in section 3 we shall show, through
counter-examples, that the observation of successes or suspensions is not
expressible in intuitionistic logic. Roughly speaking, the interpretation of cc
agents as intuitionistic formulas stumbles against the structural rule of (left)
weakening:


$$\frac{\Gamma \vdash B}{\Gamma, A \vdash B}$$


Girard’s linear logic is a fine proof-theoretical study of the weakening and
contraction structural rules of classical and intuitionistic logics. While
moving to linear logic, it is very natural to move to a non-monotonic version of
cc at the same time. This has been done by Saraswat and Lincoln in a
higher-order setting. Here we define a first-order non-monotonic variant, lcc,
for which we distinguish successes and suspensions (section 4), and we prove a
completeness result on successes: we show that the successes of an lcc
computation can be characterized in IMALL (intuitionistic multiplicative and
additive linear logic) (section 5), whereas the suspensions cannot (section 6).
We show that cc can be faithfully translated into lcc, so this result holds also
for cc.

Finally we discuss in section 6 the limits of the correspondence between linear
logic and lcc computations, and its significance for linear logic and
concurrency.


2  Monotonic  cc


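A monotonic constraint system is a pair $(\mathcal{C}, \Vdash_{\mathcal{C}})$,
where: $\mathcal{C}$ is a set of formulas (the constraints) built from a set V
of variables, a set $\Sigma$ of function and relation symbols, and logical
operators $\top$ (true), $\wedge$ and $\exists$; and
$\Vdash_{\mathcal{C}} \subseteq \mathcal{C}^* \times \mathcal{C}$. We assume
$\wedge$ has neutral element $\top$. Instead of
$((c_1 \ldots c_n), c) \in \Vdash_{\mathcal{C}}$, we write
$c_1 \ldots c_n \Vdash_{\mathcal{C}} c$.

$\vdash_{\mathcal{C}}$ is the least reflexive relation
$\subseteq \mathcal{C}^* \times \mathcal{C}$ containing $\Vdash_{\mathcal{C}}$
and closed under the following rules of intuitionistic logic: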


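$$\frac{\Gamma, d, d \vdash c}{\Gamma, d \vdash c} \qquad \frac{\Gamma \vdash c}{\Gamma, d \vdash c} \qquad \frac{\Gamma, c \vdash d \quad \Gamma \vdash c}{\Gamma \vdash d}$$

$$\frac{\Gamma \vdash c}{\Gamma \vdash \exists x\, c} \qquad \frac{\Gamma, d \vdash c}{\Gamma, \exists x\, d \vdash c}\ \ x \notin fv(\Gamma, c)$$

$$\frac{\Gamma \vdash c_1 \quad \Gamma \vdash c_2}{\Gamma \vdash c_1 \wedge c_2} \qquad \frac{\Gamma, c_1 \vdash c}{\Gamma, c_1 \wedge c_2 \vdash c} \qquad \frac{\Gamma, c_2 \vdash c}{\Gamma, c_1 \wedge c_2 \vdash c}$$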


The set $\mathcal{A}$ of cc agents is given by the following grammar:

$$A ::= c \mid c \to A \mid A \wedge A \mid A \vee A \mid \exists x A \mid p(x)$$

where c stands for a constraint, $\wedge$ for parallel composition, $\vee$ for
non-determinism and $\exists x$ for hiding of a variable x. In an agent
$A = c \wedge A_1 \wedge \cdots \wedge A_n$, if c is a constraint, the main
constructor of each $A_i$ is not $\wedge$, and no $A_i$ is a constraint, we call
c the store of A. It is the ‘constraint’ part of A.

Recursion  is  obtained  with  declarations: 

$$D ::= \epsilon \mid p(x) = A \mid D, D$$

The operational semantics is given in the style of the Chemical Abstract
Machine [3] (see also [19]). This presentation, though different from a logic
programming one, has the advantage of keeping track of the variable bindings,
and we find it therefore cleaner to manage logically.

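•  The structural congruence $\equiv$ is the least congruence such that
$(\mathcal{A}/\!\equiv, \wedge, \top)$ is an abelian monoid,
$(\mathcal{A}/\!\equiv, \vee)$ is an abelian semi-group, $\wedge$ and $\vee$ are
distributive with respect to each other, and such that, for all agents A and B:

$$\exists x \top \equiv \top \qquad \exists x \exists y A \equiv \exists y \exists x A$$

$$\frac{A \text{ and } B \text{ are } \alpha\text{-convertible}}{A \equiv B} \qquad \frac{x \text{ is not free in } A}{\exists x (A \wedge B) \equiv A \wedge \exists x B}$$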


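•  The transition relation $\longrightarrow$ is the least reflexive transitive
congruence such that:

$$c \wedge (c \to A) \longrightarrow c \wedge A \qquad \frac{A' \equiv A \quad A \longrightarrow B \quad B \equiv B'}{A' \longrightarrow B'}$$

$$\frac{c \vdash_{\mathcal{C}} d}{c \longrightarrow d} \qquad \frac{(p(x) = A) \in \mathcal{D}}{p(x) \longrightarrow A} \qquad \frac{A \longrightarrow C \quad B \longrightarrow C}{A \vee B \longrightarrow C}$$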

Remarks: 

►  This version of cc is monotonic in the following sense: for any transition
$c \wedge A \longrightarrow d \wedge B$ between agents $c \wedge A$ and
$d \wedge B$ with respective stores c and d, there exists a transition
$c \wedge A \longrightarrow c \wedge d \wedge B$ (even though d might not entail
c, as constraints may vanish through the transition rule deriving
$c \longrightarrow d$ from $c \vdash_{\mathcal{C}} d$).

►  The non-deterministic construct $\vee$ is not the angelic non-deterministic
construct of [10]; it is close to that of Lincoln and Saraswat in [13]:
intuitively $A \vee B$ can be either A or B, but you cannot decide which, your
vision is ‘blurred’, so to be able to make a ‘sharp’ observation on the rest of
the computation, both possibilities must have some common result C (a
‘coincidence’); it is a form of hiding at the level of agents, in the same way
as $\exists$ is a hiding of variables.

Agents and declarations not involving $\vee$ are said to be deterministic.

►  Concurrent constraint programming languages are parameterized by a
constraint system, which is often not mentioned explicitly in the operational
semantics or in the constraint entailment. Declarations and constraint
entailment are also implicit in the operational semantics, but the context will
not allow any confusion.

Examples: 

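►  $c \wedge (d \to A \vee e \to A) \equiv (c \wedge d \to A) \vee (c \wedge e \to A)$  (distributivity)

$\longrightarrow (d \wedge d \to A) \vee (e \wedge e \to A)$  if $c \vdash_{\mathcal{C}} d$ and $c \vdash_{\mathcal{C}} e$

$\longrightarrow (d \wedge A) \vee (e \wedge A)$

$\longrightarrow (\top \wedge A) \vee (\top \wedge A)$

$\equiv A \vee A \longrightarrow A$  (transition rule for $\vee$)

►  $c(x) \wedge \exists x (c(x) \to c(x))$ suspends because
$\exists x (c(x) \to c(x))$ is not a constraint and therefore cannot be erased;
whereas:

$c(x) \wedge \exists x (\top \to c(x)) \equiv c(x) \wedge \exists x (\top \wedge \top \to c(x))$  ($\top$ neutral for $\wedge$)

$\longrightarrow c(x) \wedge \exists x (\top \wedge c(x)) \equiv c(x) \wedge \exists x (c(x)) \longrightarrow c(x)$  ($\exists x (c(x))$ is a constraint)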

3  Observations  with  intuitionistic  logic 

3.1  Monotonic  stores 

In  concurrent  constraint  programming,  the  permanent  information  is  expressed 
by  the  store,  therefore  the  stores  are  very  natural  observations  on  cc  agents. 
Indeed  this  is  the  approach  of  Lincoln  and  Saraswat  [13]:  the  stores  of  monotonic 
cc  agents  can  be  observed  with  intuitionistic  logic  IL. 




Let $(\mathcal C, \Vdash_{\mathcal C})$ be a monotonic constraint system, and $\mathcal D$ a set of declarations.
Let IL$(\mathcal C, \mathcal D)$ be the deduction system obtained by extending IL with:

- elements of $\Vdash_{\mathcal C}$ as non-logical axioms,

- for each declaration $p(x) = A$ in $\mathcal D$, the sequent $p(x) \vdash A$ as non-logical
axiom.

$\dashv$ denotes the inverse of $\vdash$, and $A \dashv\vdash B$ stands for $A \vdash B$ and $A \dashv B$.

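Theorem 1 (Soundness) Let $A$ and $B$ be cc agents. If $A \equiv B$ then
$A \dashv\vdash_{IL(\mathcal C, \mathcal D)} B$. If $A \longrightarrow B$ then $A \vdash_{IL(\mathcal C, \mathcal D)} B$.

Proof Trivial induction on $\equiv$ and $\longrightarrow$. ■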

Theorem 2 (Observation of monotonic stores) Let $A$ be a cc agent and
$c$ a constraint. If $A \vdash_{IL(\mathcal C, \mathcal D)} c$ then there are constraints $c_1 \dots c_n$ and agents
$B_1 \dots B_n$ such that $A \longrightarrow (c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n)$ and for all $i$, $c_i \vdash_{\mathcal C} c$.

Sketch of proof It is simpler to prove the result for multisets of agents
$A_1 \dots A_n$: if $A_1 \dots A_n \vdash c$ in IL$(\mathcal C, \mathcal D)$, then $A_1 \wedge \dots \wedge A_n \longrightarrow c$. We proceed
by induction on a (sequent calculus) proof of $A_1 \dots A_n \vdash c$; note that this
works because the formula on the right is a constraint in each sequent of the
proof. Each logical rule simulates a transition rule of Icc, where commas on
the left of sequents stand for parallel composition. For axioms, cut, contraction
and $\wedge$-rules, it is evident. The same holds for the left introductions of $\vee$ and $\top$.
Since a constraint $c$ contains only $\wedge$ and $\exists$, the only other right rule to consider is
the right introduction of $\exists$, for which the induction hypothesis applies, by the
definition of constraint entailment. For weakening, just note that conjunction
$\wedge$ is distributive with respect to $\vee$, and the hypothesis applies. The only
non-trivial cases are:

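1. the $\to$ left introduction:

$$\frac{\Gamma, A \vdash c \qquad \Delta \vdash d}{\Gamma, \Delta, d \to A \vdash c}$$

By induction hypothesis, $\Delta \longrightarrow (d_1 \wedge E_1) \vee \dots \vee (d_k \wedge E_k)$, and for all $j$, $d_j \vdash_{\mathcal C} d$,
so for all $j$, $d_j \wedge E_j \wedge (d \to A) \longrightarrow d_j \wedge E_j \wedge A$. Set $F = (d_1 \wedge E_1) \vee \dots \vee (d_k \wedge E_k)$.
Then $\Gamma \wedge \Delta \wedge (d \to A) \longrightarrow \Gamma \wedge F \wedge (d \to A) \longrightarrow \Gamma \wedge F \wedge A$. Again by induction
hypothesis, $\Gamma \wedge A \longrightarrow (c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n)$ and for all $i$, $c_i \vdash_{\mathcal C} c$. Set
$G = (c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n)$. Then $\Gamma \wedge \Delta \wedge (d \to A) \longrightarrow F \wedge G$, and the
distributivity of $\wedge$ w.r.t. $\vee$ allows us to conclude.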

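2. the $\exists$ left introduction:

$$\frac{\Gamma, A \vdash c}{\Gamma, \exists x A \vdash c} \quad x \text{ not free in } \Gamma, c$$

By induction hypothesis, $\Gamma \wedge A \longrightarrow (c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n)$, so
$\exists x(\Gamma \wedge A) \longrightarrow \exists x((c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n))$. Now $x$ is not free in $\Gamma$, so
$\exists x(\Gamma \wedge A) \equiv \Gamma \wedge \exists x A$, hence $\Gamma \wedge \exists x A \longrightarrow \exists x((c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n))$, with
$\exists x((c_1 \wedge B_1) \vee \dots \vee (c_n \wedge B_n)) \vdash \exists x c$. Now $x$ is not free in $c$, so $\exists x c \equiv c$, and the
result is proved. ■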
3.2 Denotational semantics?

In this section we show through examples that the fixed points of (monotonic)
cc agents (hence their denotational semantics [23, 10]) cannot be characterized
via intuitionistic logic, even in the deterministic case.

Let $(\mathcal C, \Vdash_{\mathcal C})$ be a monotonic constraint system, and suppose, to simplify, that
the set of declarations is empty. Again to simplify, we suppose that agents are
deterministic. In [23], deterministic cc agents are viewed as continuous closure
operators on the set of constraints $\mathcal C$; an agent $A$ is determined by its set $[A]$ of
fixed points:

$[c] = \{d \in \mathcal C \mid d \vdash_{\mathcal C} c\}$,

$[c \to A] = \{d \in \mathcal C \mid d \vdash_{\mathcal C} c \text{ implies } d \in [A]\}$,

$[A \wedge B] = [A] \cap [B]$,

$[\exists x A] = \{c \in \mathcal C \mid \text{there exists } d \in [A] \text{ such that } \exists x c \dashv\vdash_{\mathcal C} \exists x d\}$.

By the first equality above, a logical characterization of the denotational
semantics of cc agents should be: $c \in [A]$ iff $c \vdash_{IL(\mathcal C, \mathcal D)} A$. But then take
$A = d \to B$: by the second equality, $c \vdash_{IL(\mathcal C, \mathcal D)} A$ iff $c \in [A]$ iff ($c \nvdash_{\mathcal C} d$ or $c \in [B]$)
iff ($c \nvdash_{\mathcal C} d$ or $c \vdash B$). This is impossible, since $c \vdash d \to B$ is by no means
equivalent to $c \nvdash d$ or $c \vdash B$: for instance, if $c$ is $x > 3$, $d$ is $x > 4$ and $B$ is just
the constraint $x > 5$, it is true that $c \nvdash d$, but $c \nvdash d \to B$. The problem is the
instantiation of free variables.
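
The counterexample above can be checked mechanically. What follows is only a sketch under stated assumptions: constraints are read as predicates over a small integer range (a hypothetical stand-in for the constraint system), `entails` is a helper introduced here for illustration, and $c \vdash d \to B$ is tested as $c, d \vdash B$, following the usual right introduction of implication. A failed check yields a genuine counterexample; a successful check on a finite sample would prove nothing.

```python
def entails(premises, conclusion, domain):
    """Return True if every value of `domain` that satisfies all premises
    also satisfies the conclusion (entailment tested on a finite sample)."""
    return all(conclusion(x) for x in domain if all(p(x) for p in premises))


# Hypothetical finite sample of values for the free variable x.
domain = range(0, 20)

c = lambda x: x > 3
d = lambda x: x > 4
B = lambda x: x > 5

c_entails_d = entails([c], d, domain)          # refuted by x = 4
c_entails_d_to_B = entails([c, d], B, domain)  # refuted by x = 5

print(c_entails_d, c_entails_d_to_B)
```

Both checks fail, which matches the text: $c$ does not entail $d$, and yet $c$ does not entail $d \to B$ either.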

Another approach to the denotational semantics of cc is the partial correctness
criterion of [7].


3.3  Successes  and  suspensions? 

The declarative nature of usual logic programming and constraint logic
programming relies essentially on the logical observation of the successes (and
failures) of a program. It is therefore natural to look for a similar
characterization in the cc setting. We define the successes of a cc computation starting
with $A$ to be the stores $c$ such that $A \longrightarrow c$. Note that, since $c \vdash_{\mathcal C} 1$ holds
for any constraint $c$, a success may just be part of a final store. Other
interesting observations would be the suspensions of a computation, i.e. the agents
$B = c \wedge (d_1 \to A_1) \wedge \dots \wedge (d_n \to A_n)$ such that $A \longrightarrow B$ and for no $i$, $c \vdash_{\mathcal C} d_i$.

Intuitionistic logic does not enable the observation of successes, nor that
of suspensions, even in the deterministic case and without program declarations.

Let $(\mathcal C, \Vdash_{\mathcal C})$ be a constraint system.

► $\dashv$: It is not true in general that $A \dashv F$ (for $F$ a success or a suspension)
implies $A \longrightarrow F$. For instance $c \to d \dashv d$, but $c \to d$ can suspend, and thus have
no success. Besides, $d \dashv d \wedge (c \to d)$, and if we do not have $d \vdash c$, $d \wedge (c \to d)$ is a
suspension, whereas the constraint $d$ is not a suspension.

► $\vdash$: Similar problems arise with $\vdash$. $d \wedge (c \to A) \vdash d$, whereas $d \wedge (c \to A)$ suspends
as soon as we do not have $d \vdash c$. Besides, $d \wedge (d \to e) \vdash d \to e$, but $d \wedge (d \to e)$
has a success ($d \wedge e$) and does not suspend.

► $\dashv\vdash$: Equivalence $\dashv\vdash$ is of no help. Suppose we do not have $d \vdash c$, and consider
the following equivalence: $d \wedge (c \to d) \dashv\vdash d$. It does not allow us to conclude anything
about the operational behaviour of the agents $d$ and $d \wedge (c \to d)$.

The  obstacle  is  the  structural  rule  of  (left)  weakening.  Therefore  we  move  to 
linear  logic.  At  the  same  time  it  is  natural  to  move  to  a  non-monotonic  version 
of  cc,  Icc,  already  introduced  by  Saraswat  and  Lincoln  [22]  and  further  studied 
by  [4,  25]. 

4  Non-monotonic  cc 

A linear constraint system is a pair $(\mathcal C, \Vdash_{\mathcal C})$ where: $\mathcal C$ is a set of formulas (the
linear constraints) built from a set $V$ of variables, a set $\Sigma$ of function and relation
symbols, and logical operators $1$, the multiplicative conjunction $\otimes$, the existential
quantifier $\exists$ and the exponential $!$; and $\Vdash_{\mathcal C} \subseteq \mathcal C^* \times \mathcal C$. We assume $\otimes$ has neutral
element $1$. Instead of $((c_1 \dots c_n), c) \in \Vdash_{\mathcal C}$, we write $c_1 \dots c_n \Vdash_{\mathcal C} c$.

$\vdash_{\mathcal C}$ is the least reflexive and transitive relation $\subseteq \mathcal C^* \times \mathcal C$ containing $\Vdash_{\mathcal C}$ and
closed by the following rules:


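$$\frac{\Gamma, c_1, c_2 \vdash c}{\Gamma, c_1 \otimes c_2 \vdash c} \qquad \frac{\Gamma \vdash c_1 \quad \Delta \vdash c_2}{\Gamma, \Delta \vdash c_1 \otimes c_2} \qquad \frac{\Gamma, c \vdash d \quad \Delta \vdash c}{\Gamma, \Delta \vdash d}$$

$$\frac{\Gamma \vdash c[t/x]}{\Gamma \vdash \exists x c} \qquad \frac{\Gamma, c \vdash d}{\Gamma, \exists x c \vdash d} \; (x \text{ not free in } \Gamma, d)$$

$$\frac{!\Gamma \vdash c}{!\Gamma \vdash\, !c} \qquad \frac{\Gamma \vdash d}{\Gamma, !c \vdash d} \qquad \frac{\Gamma, c \vdash d}{\Gamma, !c \vdash d} \qquad \frac{\Gamma, !c, !c \vdash d}{\Gamma, !c \vdash d}$$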

They are the rules of intuitionistic linear logic (ILL) for $\otimes$, $\exists$ and $!$, plus the
cut rule. The syntax of Icc agents is given by the following grammar:

$A ::= c \mid c \multimap A \mid A \otimes A \mid A \,\&\, A \mid \exists x A \mid p(x)$

where $\otimes$ stands for parallel composition, and $\multimap$ for suspension. In an agent
$A = c \otimes A_1 \otimes \dots \otimes A_n$, if $c$ is a constraint, the main constructor of each $A_i$ is
not $\otimes$, and no $A_i$ is a constraint, we call $c$ the store of $A$. It is the 'constraint'
part of $A$.
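
The grammar and the notion of store can be made concrete with a small data structure. The following is a minimal sketch, not part of the original text: all class and function names (`Atom`, `Ask`, `Par`, `With`, `Exists`, `Call`, `split_store`) are hypothetical, and the exponential $!$ is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Atom:            # atomic constraint
    name: str


@dataclass(frozen=True)
class Ask:             # suspension: guard -o body
    guard: "Agent"
    body: "Agent"


@dataclass(frozen=True)
class Par:             # parallel composition (tensor)
    left: "Agent"
    right: "Agent"


@dataclass(frozen=True)
class With:            # non-deterministic choice (&)
    left: "Agent"
    right: "Agent"


@dataclass(frozen=True)
class Exists:          # hiding of a variable
    var: str
    body: "Agent"


@dataclass(frozen=True)
class Call:            # procedure call p(x)
    name: str
    arg: str


Agent = Union[Atom, Ask, Par, With, Exists, Call]


def is_constraint(a: Agent) -> bool:
    """Constraints are built from atoms with tensor and exists ('!' omitted here)."""
    if isinstance(a, Atom):
        return True
    if isinstance(a, Par):
        return is_constraint(a.left) and is_constraint(a.right)
    if isinstance(a, Exists):
        return is_constraint(a.body)
    return False


def components(a: Agent) -> List[Agent]:
    """Flatten the top-level parallel composition into its components."""
    if isinstance(a, Par):
        return components(a.left) + components(a.right)
    return [a]


def split_store(a: Agent) -> Tuple[List[Agent], List[Agent]]:
    """Return (store, rest): the constraint components and the other components."""
    parts = components(a)
    store = [p for p in parts if is_constraint(p)]
    rest = [p for p in parts if not is_constraint(p)]
    return store, rest


# c (x) (d -o e) (x) f : the store is c (x) f, the rest is the suspension d -o e.
agent = Par(Atom("c"), Par(Ask(Atom("d"), Atom("e")), Atom("f")))
store, rest = split_store(agent)
```

Representing the store as a list of components relies on $\otimes$ being associative and commutative, which the structural congruence defined next provides.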

Recursion is obtained with declarations:

$D ::= \epsilon \mid p(x) = A \mid D, D$

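• The structural congruence $\equiv$ is the least congruent equivalence such that
$(A/\equiv, \otimes, 1)$ is an abelian monoid, $(A/\equiv, \&)$ is an abelian semi-group, and such
that, for all agents $A$ and $B$:

$$\exists x 1 \equiv 1 \qquad \exists x \exists y A \equiv \exists y \exists x A \qquad \frac{x \text{ is not free in } A}{\exists x(A \otimes B) \equiv A \otimes \exists x B}$$

$$\frac{A \text{ and } B \text{ are } \alpha\text{-convertible}}{A \equiv B} \qquad c \multimap (A \,\&\, B) \equiv (c \multimap A) \,\&\, (c \multimap B)$$
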


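• The transition between agents $\longrightarrow$ is the least reflexive transitive congruence
such that:

$$c \otimes (c \multimap A) \longrightarrow A \qquad A \,\&\, B \longrightarrow A \qquad A \,\&\, B \longrightarrow B$$

$$\frac{c \vdash_{\mathcal C} d}{c \longrightarrow d} \qquad \frac{(p(x) = A) \in \mathcal D}{p(x) \longrightarrow A} \qquad \frac{A' \equiv A \quad A \longrightarrow B \quad B \equiv B'}{A' \longrightarrow B'}$$

$$A \otimes (B \,\&\, C) \longrightarrow (A \otimes B) \,\&\, (A \otimes C)$$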


Remarks:

► Constraints are 'consumed' by suspensions; therefore the rule for $c \multimap A$
involves non-determinism, since several stores may satisfy the condition of the
rule.

► The non-deterministic $A \,\&\, B$ (already considered in [22, 4]) can behave either
like $A$ or like $B$; it has both capabilities. Note that this non-determinism is
different from that of $\vee$ in monotonic cc.

Agents and declarations not involving $\&$ are said to be deterministic.

► The exponential $!$ allows us to recover the monotonic case cc; for this reason, we
just allow it on constraints.
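
The consumption of constraints, and the non-determinism it creates, can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions that are not in the original text: stores and suspension bodies are multisets of atomic tokens, entailment is reduced to multiset inclusion, and `step` is a hypothetical helper name.

```python
from collections import Counter


def step(store, suspensions):
    """Return every configuration reachable by firing exactly one suspension.

    store:       Counter of atomic tokens.
    suspensions: list of (guard, body) pairs, both Counters of atomic tokens.
    Firing a suspension removes its guard from the store and adds its body.
    """
    results = []
    for i, (guard, body) in enumerate(suspensions):
        if all(store[a] >= n for a, n in guard.items()):  # guard available?
            new_store = store - guard + body
            remaining = suspensions[:i] + suspensions[i + 1:]
            results.append((new_store, remaining))
    return results


# c (x) (c -o d) (x) (c -o e): only one of the two suspensions can consume c.
store = Counter({"c": 1})
suspensions = [
    (Counter({"c": 1}), Counter({"d": 1})),
    (Counter({"c": 1}), Counter({"e": 1})),
]
outcomes = step(store, suspensions)
```

Here `outcomes` contains two configurations, one with store `d` and one with store `e`; in each, the remaining suspension can no longer fire because `c` has been consumed.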

Translation  of  deterministic  cc  into  Icc 

In the next section we show that IMALL (intuitionistic multiplicative additive
linear logic) enables the observation of the successes of Icc agents. To be able
to observe the successes of monotonic cc agents, one has to prove that cc agents
can be translated into Icc ones, in such a way that the operational semantics is
preserved. We do it for deterministic cc agents. The non-deterministic case can
be translated into Icc as well, but it requires the use of the additive disjunction
$\oplus$, which we do not consider in this paper to avoid confusion. We postpone the
treatment of full monotonic cc to further work.

Let $(\mathcal C, \Vdash_{\mathcal C})$ be a constraint system. The linear constraint system $(\mathcal C, \Vdash_{\mathcal C})^\dagger$ has
the same set of atomic constraints (with $\top$ renamed $1$). The associated compound
constraints and deterministic cc agents are translated into linear constraints and
Icc agents:




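$c^\dagger = \,!c$, if $c$ is atomic $\qquad p(x)^\dagger = p(x)$

$(c \to A)^\dagger = c^\dagger \multimap A^\dagger \qquad (\exists x A)^\dagger = \exists x A^\dagger$

$(A \wedge B)^\dagger = A^\dagger \otimes B^\dagger$

Observe that $c$ is a (monotonic) constraint iff $c^\dagger$ is a (linear) constraint. The
proof of the following proposition is then straightforward:

Proposition 1 Let $c$ and $d$ be monotonic constraints: $c \vdash_{cc} d$ iff $c^\dagger \vdash_{Icc} d^\dagger$. Let
$A$ and $B$ be deterministic cc agents: $A \equiv_{cc} B$ iff $A^\dagger \equiv_{Icc} B^\dagger$, and $A \longrightarrow_{cc} B$ iff
$A^\dagger \longrightarrow_{Icc} B^\dagger$.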

For the atoms, $\wedge$ and $\exists$, our translation is Girard's second translation of
intuitionistic logic into linear logic [8, p. 81]. The only difference is the translation
of $\to$ ($(c \to A)^\dagger = c^\dagger \multimap A^\dagger$ instead of $!(c^\dagger \multimap A^\dagger)$), which essentially forbids the erasure of
suspensions.
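
The translation is a straightforward structural recursion. The sketch below assumes a hypothetical tuple encoding of agents that is not part of the original text: `("atom", name)`, `("top",)`, `("call", p, x)`, `("ask", c, A)`, `("and", A, B)` and `("exists", x, A)` on the cc side, and `("bang", c)`, `("one",)`, `("lolli", c, A)`, `("tensor", A, B)` on the Icc side.

```python
def translate(agent):
    """Translate a deterministic cc agent into an Icc agent (tuple encoding)."""
    kind = agent[0]
    if kind == "top":        # the neutral element T is renamed 1
        return ("one",)
    if kind == "atom":       # an atomic constraint c becomes !c
        return ("bang", agent)
    if kind == "call":       # p(x) is left unchanged
        return agent
    if kind == "ask":        # c -> A becomes (translated c) -o (translated A)
        _, guard, body = agent
        return ("lolli", translate(guard), translate(body))
    if kind == "and":        # A /\ B becomes a tensor of the translations
        _, left, right = agent
        return ("tensor", translate(left), translate(right))
    if kind == "exists":     # exists x. A: translate under the quantifier
        _, var, body = agent
        return ("exists", var, translate(body))
    raise ValueError(f"unknown constructor: {kind!r}")


# c -> (d /\ p(x))
source = ("ask", ("atom", "c"), ("and", ("atom", "d"), ("call", "p", "x")))
target = translate(source)
```

Note that the suspension itself is not wrapped in `"bang"`: only atomic constraints are, which is the design choice that prevents suspensions from being erased.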


5  Observing  successes  with  IMALL 

Let $(\mathcal C, \Vdash_{\mathcal C})$ be a fixed linear constraint system, and $\mathcal D$ be a fixed set of
declarations. Let IMALL$(\mathcal C, \mathcal D)$ be the deduction system obtained by extending
IMALL with:

- elements of $\Vdash_{\mathcal C}$ as non-logical axioms,

- for each declaration $p(x) = A$ in $\mathcal D$, the sequent $p(x) \vdash A$ as non-logical
axiom.

We define the successes of an Icc computation starting with $A$ to be the
stores $c$ such that $A \longrightarrow c$. The suspensions of $A$ are the agents
$B = c \otimes (d_1 \multimap A_1) \otimes \dots \otimes (d_n \multimap A_n)$ such that $A \longrightarrow B$ and for no $i$, $c \vdash_{\mathcal C} d_i$.

Theorem 3 (Soundness) Let $A$ and $B$ be Icc agents.

If $A \equiv B$ then $A \dashv\vdash_{IMALL(\mathcal C, \mathcal D)} B$. If $A \longrightarrow B$ then $A \vdash_{IMALL(\mathcal C, \mathcal D)} B$.

Proof Trivial induction on $\equiv$ and $\longrightarrow$. ■

A success for an Icc agent $A$ is a linear constraint $c$ such that $A \longrightarrow c$.

Theorem 4 (Observation of successes) Let $A$ be an Icc agent, and $c$ be a
linear constraint. If $A \vdash_{IMALL(\mathcal C, \mathcal D)} c$, then $A \longrightarrow c$, i.e. $c$ is a success for $A$.

Sketch of proof It is simpler to prove the result for multisets of agents
$A_1 \dots A_n$: if $A_1 \dots A_n \vdash c$ in IMALL$(\mathcal C, \mathcal D)$, then $A_1 \otimes \dots \otimes A_n \longrightarrow c$. We proceed
by induction on a (sequent calculus) proof of $A_1 \dots A_n \vdash c$. Each logical rule
simulates a transition rule of Icc, where commas on the left of sequents stand
for parallel composition. For axioms, cut, $\otimes$-rules and $!$-rules, it is evident. Since
a constraint $c$ contains only $\otimes$, $\exists$ and $!$, the only other right rule to consider is
the right introduction of $\exists$, for which the induction hypothesis applies, by the
definition of constraint entailment. It is evident for the left introductions of
$\&$ and $1$:

$$\frac{\Gamma, A \vdash c}{\Gamma, A \,\&\, B \vdash c} \qquad \frac{\Gamma, B \vdash c}{\Gamma, A \,\&\, B \vdash c} \qquad \frac{\Gamma \vdash c}{\Gamma, 1 \vdash c}$$

The  only  non-trivial  cases  are: 

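1. the $\multimap$ left introduction:

$$\frac{\Gamma, A \vdash c \qquad \Delta \vdash d}{\Gamma, \Delta, d \multimap A \vdash c}$$

By induction hypothesis, $\Delta \longrightarrow d$, so $(\Gamma \otimes \Delta \otimes (d \multimap A)) \longrightarrow (\Gamma \otimes d \otimes (d \multimap A)) \longrightarrow (\Gamma \otimes A)$.
Again by induction hypothesis, $(\Gamma \otimes A) \longrightarrow c$, so $(\Gamma \otimes \Delta \otimes (d \multimap A)) \longrightarrow c$.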

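2. the $\exists$ left introduction:

$$\frac{\Gamma, A \vdash c}{\Gamma, \exists x A \vdash c} \quad x \text{ not free in } \Gamma, c$$

By induction hypothesis, $(\Gamma \otimes A) \longrightarrow c$, so $\exists x(\Gamma \otimes A) \longrightarrow \exists x c$. But $x$ is not
free in $c$, so $\exists x c \equiv c$. And as $x$ is not free in $\Gamma$, $\exists x(\Gamma \otimes A) \equiv (\Gamma \otimes \exists x A)$, hence
$(\Gamma \otimes \exists x A) \longrightarrow c$. ■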

Thanks  to  the  translation  from  cc  to  Icc,  the  result  holds  for  cc  agents  as 
well. 
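
As a small worked instance of Theorem 4 (an illustration added here, with $c$ and $d$ standing for arbitrary atomic linear constraints), the $\multimap$ left introduction applied to two axioms derives a sequent whose matching computation is a single firing of a suspension:

```latex
\[
\frac{d \vdash d \qquad c \vdash c}
     {c,\; c \multimap d \;\vdash\; d}\;(\multimap\text{ left})
\qquad\text{corresponds to}\qquad
c \otimes (c \multimap d) \longrightarrow d .
\]
```

Here the rule is used with an empty context $\Gamma$, with $\Delta = c$, and with $d$ playing the role of both the body of the suspension and the observed success.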

It is worth noting that a success is not proved to be a constraint entailing
the initial agent (as in constraint logic programming), but entailed by the initial
agent. This change of perspective is not very surprising in fact, since suspensions
$c \multimap A$ implicitly contain a kind of negation (in the form of linear implication
$\multimap$, as we shall see), which reverses the direction of deduction.


6  Discussion 

We have shown that linear logic enables finer observations on cc and Icc
agents than intuitionistic logic.

6.1  The  limits  of  the  correspondence 

This result emphasizes the correspondence between the transition relation $\longrightarrow$
on agents and the entailment relation $\vdash$ in intuitionistic linear logic, and it is
therefore a first step towards viewing concurrent constraint programming as (a
fragment of some version of) linear logic. The purpose of this paragraph is to
make more precise the limits of this approach.

A look at the proof of Theorem 4 shows that the left introduction rules of
IMALL sequent calculus just correspond to transitions in Icc. The rules for $\otimes$
express the monoidal structure of $\otimes$. Program declarations have been oriented
($p(x) = A$ is translated into $p(x) \vdash A$): then an axiom $p(x) \vdash A$ just corresponds
to the replacement of the agent $p(x)$ by the agent $A$.

On the contrary, the right introductions of $\&$, $\multimap$ and $\exists$ do not correspond
to transitions of agents. Let us look at them more closely:

- The right introduction of $\&$:

$$\frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \,\&\, B}$$

expresses the combination of two experiments starting with the same agent.
You know that $A \,\&\, B$ can behave either like $A$ or like $B$ (left introduction),
and now you postulate the converse, i.e. that observing either $A$ or $B$ from $\Gamma$
means $\Gamma \longrightarrow A \,\&\, B$. The right introduction of $\&$ is equivalent to $A \,\&\, A \dashv\vdash A$.

- The right introduction of $\multimap$:

$$\frac{\Gamma, A \vdash B}{\Gamma \vdash A \multimap B}$$

expresses a 'porosity' of suspensions: for instance, $A \otimes (c \multimap B) \vdash c \multimap (A \otimes B)$.
A strong form of porosity is considered in process calculi like the
$\pi$-calculus [16], namely enablement: for a guard $\omega$ such that no free variable
in $A$ is bound in $\omega$, it states that $\omega.(A \otimes B) \equiv A \otimes (\omega.B)$. But it is not
considered in cc, and as we shall see in the next paragraph, it forbids the
observation of suspensions.

- The right introduction of $\exists$:

$$\frac{\Gamma \vdash A[t/x]}{\Gamma \vdash \exists x A}$$

expresses blindness. The agent refuses the communication with the
environment through variable (or channel) $x$, which can lead it to deadlock.

The  significance  of  these  remarks  is  twofold: 

1.  they  are  limits  to  the  correspondence  between  deduction  in  IMALL  and  the 
operational  semantics  of  cc  and  Icc  languages, 

2.  their  rather  intuitive  operational  interpretation  (at  least  for  the  first  two 
ones)  deserves  further  consideration. 


6.2  Suspensions 

Concurrent constraint programming suggests another interesting observation,
namely the suspensions of an agent: a suspension for an Icc agent $A$ is an agent
$B = c \otimes (d_1 \multimap A_1) \otimes \dots \otimes (d_n \multimap A_n)$ such that $A \longrightarrow B$ and for no $i$, $c \vdash_{\mathcal C} d_i$.

The above remark on the right introduction of $\multimap$ shows that suspensions
cannot be observed in the setting of IMALL: for instance, $c \otimes (c \multimap 1) \vdash c \multimap (c \otimes 1)$,
a suspension, whereas $c \otimes (c \multimap 1)$ succeeds with $1$. This 'porosity' is linked
to the lack of a sequential composition connective in ILL. Such a connective needs
to be non-commutative. Abrusci [1] studied a pure non-commutative version of
LL, without commutative and additive ($\&$) connectives, and we need at least a
commutative multiplicative connective (the 'parallel' connective). Retoré's before
connective $<$ [20] is not a solution either, since $A \otimes (B < C) \vdash B < (A \otimes C)$.

The next step of our investigation will be to define a non-commutative version
of linear logic which copes with this difficulty.


6.3  The  significance  for  linear  logic  and  concurrency 

Other (linear) logical approaches have been proposed to study concurrency with
the approach of logic programming. Andreoli and Pareschi [2] point out that
the 'proof-search as computation' analogy for linear logic corresponds to a
reactive paradigm, but in Linear Objects, synchronous message passing involves
extra-logical operators ('tell markers'), whereas concurrent constraint
programming is asynchronous, which avoids resorting to such extra-logical operators.
Miller [15] describes a connection between the $\pi$-calculus and linear logic, but
uses non-logical constants as well; a connection à la Miller between Boudol's
asynchronous version of the $\pi$-calculus [5] and our work should be interesting.
Perrier [18] proposes a denotational semantics based on the phase semantics, to
model the interaction capability of a process. Our paper focuses on concurrent
constraint programming, so the two approaches are different, but we think a
thorough comparison would be interesting. Kobayashi and Yonezawa [11] define
several concurrent semantics for linear logic processes, including bisimulation;
we believe the relationship with concurrent constraint programming through
our work should be worth investigating.
Abstract. A globally consistent labeling is a compact representation of
the complete solution space for a constraint satisfaction problem (CSP).
Constraint satisfaction is NP-complete and so is the construction of globally
consistent labelings for general problems. However, for binary constraints,
it is known that when constraints are convex, path-consistency
is sufficient to ensure global consistency and can be computed in polynomial
time. We show how in continuous domains, this result can be
generalized to ternary and in fact arbitrary n-ary constraints using the
concept of (3,2)-relational consistency. This leads to polynomial-time
algorithms for computing globally consistent labelings for a large class of
numerical constraint satisfaction problems.


1  Introduction 

Many problems, ranging from resource allocation and scheduling to fault
diagnosis and design, involve numerical constraint satisfaction as an essential
component. These problems often represent complex decision processes where the set
of variables and constraints involved is not independent of particular solutions
and where relevant information, in the form of active constraints and variables,
is revealed only as the task proceeds and decisions are taken. In the case where
variables and constraints are numerical, the search space for such problems
becomes unbounded in size; each numerical value may trigger a different active
context and thus potentially lead to a different solution.

Figure 1 shows an example from civil engineering where different values for
beam depth and beam span lead to different design options and constraints.
Choosing the values of the beam's depth and span within regions 3 or 4 would
increase the susceptibility to vibrations and involve installing bridging (lateral
reinforcements) for damping the floor (Figure 1, -a-). Choosing these values within
region 1, 2 or 3 makes it possible to have the ventilation ducts go under the beams,
while choosing them in region 4 would require making openings in the beams to
allow passage of the ducts.

Identifying single point solutions, possibly optimal according to some
criterion, is the viewpoint adopted by almost all the existing mathematical solvers,
ranging from linear and non-linear programming to numerical analysis and
stochastic techniques. Alternatively, consistency techniques offer the possibility of
producing a compact description of the space of all solutions by assigning labels
(sets of legal values) to individual variables or combinations of variables. This is
essential for reasoning about design alternatives, as in the example of Figure 1.

While in general, computing a globally consistent labeling is NP-hard, recent
results [8] show that in the case where constraints are convex, low orders of
consistency are equivalent to global consistency. For binary constraints (involving
at most 2 variables), it has been shown that 3-consistency (also called
path-consistency and computable in polynomial time) is equivalent to global
consistency [8], [2]. However in discrete domains, it has been shown [9] that the
generalization of these results to ternary (and higher arity) constraints may
involve significantly higher degrees of consistency and thus complexity. In this
paper, we show that much more positive results can be obtained in continuous
domains.

In fact, we introduce a concept of (3,2)-relational consistency which can be
computed in polynomial time and proven equivalent to global consistency for
constraint networks containing ternary constraints as well.

We also show how these results can be reliably implemented in practice using
an appropriate representation of continuous constraints.


2  Problem  statement 

In this work we consider constraint satisfaction problems in continuous domains.
Variable domains are intervals over the reals and constraints are numerical
equalities and inequalities of arbitrary types and arities. For practical considerations,
the methods developed target problems where both variables and constraints
have physical interpretations and can be handled with limited degrees of
precision, as is the case in many engineering applications.

A continuous CSP (CCSP), $\mathcal P = (V, D, R)$, is defined as a set $V$ of variables
$x_1, x_2, \dots, x_n$ taking their values respectively in a set $D$ of continuous domains
$D_1, D_2, \dots, D_n$ and constrained by a set of relations $R_1, \dots, R_m$. A domain is an
interval of $\mathbb R$ and a relation is defined intensionally by a set of arbitrary equalities
and inequalities.

Given a CCSP, we require that for each subset of variables $(x_1, \dots, x_k)$,
a unique relation $R(x_1 \dots x_k)$ exists in the underlying constraint network. In
other words, each hyper-arc of the constraint network will be labeled by a total
constraint [3]. We recall that a total constraint between a set $S$ of variables is given
as the region formed by combining all mathematical constraints involving $S$.
We define:

Definition 1. (Convex relation)

Let $\mathcal P = (V, C, D)$ be a CCSP. A relation $R(x_1 \dots x_k)$ over $C$ is convex if it
determines a convex solution space in the domain $D_{x_1} \times \dots \times D_{x_k}$, where $D_i \in D$.
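
Convexity of a relation given intensionally can be probed numerically. The sketch below is an illustration under stated assumptions, not a method from the text: `convexity_counterexample` is a hypothetical helper that samples points on a grid and looks for two feasible points whose midpoint is infeasible. Finding such a pair refutes convexity; finding none is only evidence, not a proof.

```python
from itertools import combinations


def convexity_counterexample(relation, points):
    """Return two points satisfying `relation` whose midpoint violates it,
    or None if no such pair exists among `points`."""
    inside = [p for p in points if relation(*p)]
    for p, q in combinations(inside, 2):
        midpoint = tuple((a + b) / 2 for a, b in zip(p, q))
        if not relation(*midpoint):
            return p, q
    return None


# Hypothetical sample grid over [-1, 1] x [-1, 1] with step 0.25.
grid = [(i / 4, j / 4) for i in range(-4, 5) for j in range(-4, 5)]

disk = lambda x, y: x * x + y * y <= 1.0          # convex region
ring = lambda x, y: 0.25 <= x * x + y * y <= 1.0  # region with a hole

disk_witness = convexity_counterexample(disk, grid)
ring_witness = convexity_counterexample(ring, grid)
```

On this grid no counterexample is found for the disk, while the ring yields a pair of feasible points whose midpoint falls in the hole.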
Fig. 1. Many CCSPs are embedded within complex decision processes


A CCSP is called convex when all its relations are convex. In a globally
consistent network, any partial consistent instantiation of a subset of variables
can be extended to a solution with no backtracking [1], a process which can
generally be carried out in linear time. For both simple temporal problems and
row-convex discrete problems, it has been observed that convexity of the constraint
relations means that path-consistency is sufficient to ensure a globally consistent
labeling. This result is proven using Helly's Theorem:

Theorem 2 (Helly). Let $F$ be a finite family of at least $n + 1$ convex sets in
$\mathbb R^n$ such that every $n + 1$ sets in $F$ have a point in common. Then all the sets
have a point in common.

Helly's Theorem can be applied to show that for each assignment of $n$
variables $x_1, x_2, \dots, x_n$, there exists a consistent value which can be assigned to
$x_{n+1}$ in the following way. Since the constraint network is binary, the only
constraints existing between $x_1, \dots, x_n$ and $x_{n+1}$ are individual constraints between
$x_i$, $i \in \{1..n\}$, and $x_{n+1}$. Since the constraints are convex, every variable $x_i$
already assigned constrains $x_{n+1}$ to a single interval. Path-consistency ensures
that every pair of such intervals intersect each other. Thus, by Helly's Theorem,
there must exist a common intersection of all the intervals (i.e., at least one value
for $x_{n+1}$) which is consistent with all previous assignments, and consequently
the assignment can be extended.

Fig. 2. Helly's Theorem in $\mathbb R^2$: if a finite set of binary convex regions is such that
each triplet of regions has a non-null intersection, then the whole set of regions has a
non-null common intersection (shaded area)
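
In one dimension the argument reduces to intervals: if every pair of closed intervals intersects, the interval from the largest lower bound to the smallest upper bound is non-empty and lies in all of them. The following sketch illustrates this; the helper names and the sample intervals are hypothetical.

```python
from itertools import combinations


def intersect(a, b):
    """Intersection of two closed intervals (lo, hi), or None if empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def pairwise_intersecting(intervals):
    """True if every pair of intervals has a point in common."""
    return all(intersect(a, b) is not None for a, b in combinations(intervals, 2))


def common_intersection(intervals):
    """Intersection of all the intervals, or None if it is empty."""
    lo = max(i[0] for i in intervals)
    hi = min(i[1] for i in intervals)
    return (lo, hi) if lo <= hi else None


# Intervals that three already-assigned variables impose on the next variable.
intervals = [(0.0, 5.0), (3.0, 8.0), (4.0, 6.0)]
pairwise_ok = pairwise_intersecting(intervals)
feasible_values = common_intersection(intervals)
```

With these sample intervals, every pair intersects and the common intersection is the interval from 4.0 to 5.0, so any value in that range extends the assignment.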

3  Convex  n-ary  CCSPs 

N-ary continuous and discrete CSPs can be translated into ternary ones without
loss of information [4]. To generalize the result from binary discrete networks to
ternary discrete CSPs, van Beek and Dechter [9] have introduced the notion of
relational path-consistency for discrete problems:

Definition 3 (van Beek & Dechter). Let $\mathcal R$ be a network of relations over a
set of variables $X$, and let $R_S$ and $R_T$ be two relations in $\mathcal R$, where $S, T \subseteq X$.
We say that $R_S$ and $R_T$ are relationally path-consistent relative to variable $x$ iff
any consistent instantiation of the variables in $(S \cup T) - \{x\}$ has an extension
to $x$ that satisfies $R_S$ and $R_T$ simultaneously. A pair of relations $R_S$ and $R_T$ is
relationally path-consistent iff it is relationally path-consistent relative to each
variable in $S \cap T$. A network is relationally path-consistent iff every pair of
relations is relationally path-consistent.

By definition, relational path-consistency guarantees, for each set of relations
having a variable $x$ in common, that the pairwise intersections of their unary
projections over the $x$ axis are non-empty. Helly's Theorem thereby becomes
applicable, which results in the following Theorem [9]:

Theorem 4 (van Beek & Dechter). Let $\mathcal R$ be a network of relations that is
relationally path-consistent. If there exists an ordering of the domains $D_1 \dots D_n$
of $\mathcal R$ such that the relations are row convex, the network is globally consistent.

Relational path-consistency ensures pairwise non-null intersection of unary
projections, with the objective of applying Helly's Theorem in one dimension
Procedure 3-2-rel-con(V,C,D)
repeat
    changed ← false
    for each pair (u,v), u,v ∈ V do
        for each ternary tuple (i,j,k), i,j,k ∈ V do
        begin
            c'(i,j,k) ← c(i,j,k) ⊕ Π_{i,j,k}(c(i,u,v) ⊗ c(j,u,v) ⊗ c(k,u,v))
            if c'(i,j,k) ≠ c(i,j,k) then
            begin
                c(i,j,k) ← c'(i,j,k)
                changed ← true
            end
        end
until changed = false

Fig. 3. Algorithm for computing a (3,2)-relationally consistent labeling.


to individual variables. Composing pairs of ternary relations (with a variable in
common) results in a relation of arity five. Thus, for global consistency it might
be necessary to guarantee that a set of four variables is extensible to a fifth one.
But this means that it might be necessary to ensure relational path-consistency
for relations of arity four and, recursively, for relations of unbounded arity,
thus possibly engendering an intractable complexity in the most general case.

The alternative generalization we propose is based on the observation that
the extensibility of a ternary set of variables to a binary region (rather than to
a unary one, as for relational path-consistency) does not involve relations with
arity greater than 3 and thus removes the causes behind combinatorial explosion.
This approach implies that Helly's Theorem must be applied in two dimensions
rather than one (see Figure 2). For the case of ternary networks, we introduce
the notion of (3,2)-relational consistency, which guarantees that each triplet of
relations with two variables in common has a non-null intersection.

Definition 5. (Extension)

Let $\mathcal{P} = (V, C, D)$ be a constraint satisfaction problem. Let $V_1$ and
$V_2$ be subsets of $V$. $V_1$ has an extension to $V_2$ if any consistent
instantiation of the variables in $V_1$ can be extended to a consistent
instantiation of the variables in $V_1 \cup V_2$.

Definition 6. ((3,2)-relational consistency)

Let $\mathcal{P}$ be a ternary network of relations over a set of variables $X$.
Let $R_1(x_1, u, v)$, $R_2(x_2, u, v)$ and $R_3(x_3, u, v)$ be three relations of
$\mathcal{P}$ which share two variables $u$ and $v$, where $u$ might be identical
to $v$. $R_1$, $R_2$ and $R_3$ are (3,2)-relationally consistent relative to
$\{u, v\}$ if and only if any consistent instantiation of the 3 variables in
$x_1, x_2, x_3$ has an extension to $\{u, v\}$ that satisfies $R_1$, $R_2$ and
$R_3$ simultaneously.

Since (3,2)-relational consistency only requires labels between at most three
variables, it does not add to the arity of a ternary constraint network. Provided
that each binary projection is convex, (3,2)-relational consistency enables the
application of Helly's Theorem in two dimensions. However, Helly's Theorem
only guarantees that each pair of variables has a non-empty domain. It remains
to show that the constraints on each individual variable are also non-empty.

The simple algorithm of Figure 3 terminates with a (3,2)-relationally
consistent set of labels.

This algorithm takes as input a ternary CCSP, $\mathcal{P} = (V, C, D)$, where
$V$ is the set of variables, $C$ the set of constraints, and $D$ the set of
variable domains. $c$ denotes the label of a relation in $C$, and $R$ denotes
relations in $C$.
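The fixpoint loop of Figure 3 can be illustrated on small finite domains. The
sketch below is not the authors' continuous implementation: it assumes labels are
stored as explicit sets of value triples keyed by sorted variable triples, treats
any missing relation as universal, and only considers pairs (u,v) disjoint from
the triple (i,j,k). The names `label` and `rel_con_3_2` are hypothetical.

```python
from itertools import combinations, product

def label(c, dom, a, b, d):
    """Label on (a, b, d) as a set of value triples in that variable order.
    A relation that is not stored is taken to be universal (sketch assumption)."""
    key = tuple(sorted((a, b, d)))
    if key not in c:
        return set(product(dom[a], dom[b], dom[d]))
    pos = [key.index(x) for x in (a, b, d)]
    return {tuple(t[p] for p in pos) for t in c[key]}

def rel_con_3_2(variables, c, dom):
    """Repeat the (3,2)-relational filtering step until no label changes."""
    variables = sorted(variables)
    c = {key: set(rel) for key, rel in c.items()}
    changed = True
    while changed:
        changed = False
        for u, v in combinations(variables, 2):
            others = [x for x in variables if x not in (u, v)]
            for i, j, k in combinations(others, 3):
                ci, cj, ck = (label(c, dom, x, u, v) for x in (i, j, k))
                old = label(c, dom, i, j, k)
                # keep a triple only if it extends to some value pair for (u, v)
                new = {(a, b, e) for (a, b, e) in old
                       if any((a, p, q) in ci and (b, p, q) in cj and (e, p, q) in ck
                              for p in dom[u] for q in dom[v])}
                if new != old:
                    c[(i, j, k)] = new
                    changed = True
    return c

# Toy network on five 0/1 variables: x0 = x3, x1 = x3, x2 != x3 (x4 is free).
dom = {x: (0, 1) for x in range(5)}
c = {
    (0, 3, 4): {t for t in product((0, 1), repeat=3) if t[0] == t[1]},
    (1, 3, 4): {t for t in product((0, 1), repeat=3) if t[0] == t[1]},
    (2, 3, 4): {t for t in product((0, 1), repeat=3) if t[0] != t[1]},
}
result = rel_con_3_2(range(5), c, dom)
print(sorted(result[(0, 1, 2)]))
```

The induced label on (x0, x1, x2) keeps only the triples with x0 = x1 and
x2 different from both, which is what extensibility to the pair (x3, x4) requires.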

Using  Helly’s  Theorem  in  two  dimensions  we  can  state  the  following  result: 

Theorem 7. For any convex ternary network $\mathcal{P}$, (3,2)-relational
consistency will either decide that $\mathcal{P}$ is inconsistent, or else compute
an equivalent globally consistent labeling of $\mathcal{P}$, in time $O(n^5)$
where $n$ is the number of variables of $\mathcal{P}$.

Informally, the proof consists of the following steps:

1. we first prove that when applied to a ternary network $\mathcal{P}$, the
algorithm for (3,2)-relational consistency results in an empty network if a
given pair of variables has an empty label,

2. in order to show that non-empty labels on each pair of variables imply
global consistency, we introduce a binary dual representation of the original
ternary problem. This dual representation, by the fact that it is binary, makes
it easier to show how an instantiation process can be carried out
backtrack-free when binary labels are non-empty,

3. the dual representation is shown to be globally consistent and equivalent to
the primal one.

The binary dual representation of a ternary network $\mathcal{P}(V, C, D)$ is a
binary network, $\mathcal{P}_d(V_d, C_d, D_d)$, such that:

- $V_d = \{\alpha_1, \ldots, \alpha_m\}$. A variable $\alpha_j$ of $V_d$
represents a pair of variables in the original network (an element of
$V \times V$),

- a domain of $D_d$ is an element of $D \times D$,

- a relation between two variables $\alpha_u$ and $\alpha_v$ of $V_d$ is the
relation $R(x_{(u,1)}, x_{(u,2)}, x_{(v,1)}, x_{(v,2)})$ resulting from the
composition of the relations between $x_{(u,1)}$, $x_{(u,2)}$, $x_{(v,1)}$ and
$x_{(v,2)}$ in the original problem $\mathcal{P}$.

In the following, the fact that each pair of variables has a non-empty label will
be referred to as the binary-extensibility property:

Definition 8. (Binary-extensibility)

Let $\mathcal{P} = (V, C, D)$ be a network of relations. $\mathcal{P}$ is said to
be binary-extensible if any subset of $\mathcal{P}$'s variables has an extension
to any pair of variables of $\mathcal{P}$.

We now present the intermediate results needed for stating Theorem 7 along
with sketches of their proofs.

Lemma 9. Let $\mathcal{P}'$ be a (3,2)-relationally consistent ternary network
and let $\mathcal{P}'_d$ be its dual representation. The following propositions
are verified:
i. each partial solution of $\mathcal{P}'$ corresponds to a partial solution of
$\mathcal{P}'_d$
ii. each partial solution of $\mathcal{P}'_d$ corresponds to a partial solution
of $\mathcal{P}'$

This result follows immediately from the definition of the dual network
representation.


Corollary 10. Let $\mathcal{P}'$ be a (3,2)-relationally consistent ternary
network and let $\mathcal{P}'_d$ be its dual representation. $\mathcal{P}'$ is
equivalent to $\mathcal{P}'_d$.

$\mathcal{P}'$ and $\mathcal{P}'_d$ are equivalent in the sense that each
solution of the first network is also a solution of the latter one, and vice
versa. Since Lemma 9 is stated for arbitrary partial instantiations, it also
holds for a global instantiation.

Lemma 11. Let $\mathcal{P}$ be a ternary network of relations, $\mathcal{P}'$ be
its (3,2)-relationally consistent counterpart and $\mathcal{P}'_d$ be the dual
representation of $\mathcal{P}'$. If $\mathcal{P}$ has no solution,
$\mathcal{P}'_d$ is empty.

Sketch of proof. An inconsistent network $\mathcal{P}$ is a fortiori not
binary-extensible (i.e. there exists at least one pair of variables with an
empty label). Since the algorithm for (3,2)-relational consistency computes the
closure of $\mathcal{P}$ with respect to binary-extensibility, it will
necessarily result in an empty (3,2)-relationally consistent representation
$\mathcal{P}'$. $\mathcal{P}'$ being equivalent to $\mathcal{P}'_d$
(Corollary 10), $\mathcal{P}'_d$ is also empty □.

Lemma 12. Let $\mathcal{P}$ be a ternary network of relations, $\mathcal{P}'$ be
its (3,2)-relationally consistent counterpart and $\mathcal{P}'_d$ be the dual
representation of $\mathcal{P}'$. If $\mathcal{P}'_d$ is non-empty,
$\mathcal{P}'_d$ is globally consistent.


Sketch of proof. Suppose that $k - 1$ variables of $\mathcal{P}'_d$ have been
consistently instantiated. Using Helly's Theorem in two dimensions, we first
show [5] that in the case where the relations of $\mathcal{P}'_d$ are not empty,
each consistent instantiation of three variables in $\mathcal{P}'_d$ can be
consistently extended to a fourth one. This guarantees that each subset of three
of $\mathcal{P}'_d$'s relations, $R(\alpha_k, \alpha_a)$, $R(\alpha_k, \alpha_b)$,
$R(\alpha_k, \alpha_c)$ (where $a, b, c \in [1..k-1]$), has a non-null
projection over $\alpha_k$. Since each relation $R(\alpha_k, \alpha_i)$ is a
convex, non-empty region of $\mathbb{R}^2$ (variables $\alpha_i$,
$i \in [1..k-1]$, are instantiated), Helly's Theorem is applicable and
guarantees that $\bigcap_{i=1}^{k-1} R(\alpha_k, \alpha_i) \neq \emptyset$. This
means that $\alpha_k$ can be instantiated consistently.
This result holds for an arbitrary $k$, hence $\mathcal{P}'_d$ is globally
consistent □.


Theorem 7 follows immediately from Lemmas 9, 11 and 12. $\mathcal{P}$ being a
ternary convex network of relations, $\mathcal{P}'$ its (3,2)-relationally
consistent counterpart and $\mathcal{P}'_d$ the dual representation of
$\mathcal{P}'$, Lemma 9 guarantees the equivalence of $\mathcal{P}'$ and
$\mathcal{P}'_d$, Lemma 11 ensures that an inconsistent $\mathcal{P}$ results in
an empty $\mathcal{P}'_d$ (and hence $\mathcal{P}'$) representation, and
finally, Lemma 12 guarantees that if a solution exists, $\mathcal{P}'_d$ (and
hence $\mathcal{P}'$) is globally consistent.

Complexity. The number of relations checked for binary-extensibility is initially
in $O(n^3 + n^2) = O(n^3)$. Each time a relation is modified,
$O(n^2 + n) = O(n^2)$ new (3,2)-relational compositions are computed. The global
time complexity of (3,2)-relational consistency is therefore $O(n^5)$ □.

Given that $k$ variables $\{x_1, x_2, \ldots, x_k\}$ of $\mathcal{P}$ have
already been instantiated, finding a value for a $(k+1)$-th variable is always
possible: it amounts to finding a value, in $\mathcal{P}'_d$, for a node
$(x_{k+1}, x_j)$, $j \in [1..k]$. A possible backtrack-free instantiation
procedure for deriving the solutions of $\mathcal{P}$ would be as follows:

1. Choose a value $X_1$ of $x_1$ that satisfies $R(x_1)$

2. For $i \leftarrow 2$ to $n$ do

3. $l_i \leftarrow \bigcap_{j=1..i-1} R(x_i, X_j)$

4. $X_i \leftarrow$ choose a value for $x_i$ in $l_i$
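The four steps above can be sketched in code under strong simplifying
assumptions that are not the paper's setting: the network is binary, every
constraint has the convex form $|x_i - x_j| \le$ `bound`, and all variables share
one interval domain, so that each label $R(x_i, X_j)$ is an interval. The
function name `instantiate` and its parameters are hypothetical.

```python
def instantiate(n, domain, bound, pick=lambda lo, hi: (lo + hi) / 2):
    """Assign n variables in order without ever revising an earlier choice."""
    lo, hi = domain
    values = [pick(lo, hi)]                       # step 1: value for the first variable
    for _ in range(1, n):                         # step 2: remaining variables in order
        # step 3: intersect the intervals [X_j - bound, X_j + bound] with the domain
        l_lo = max([lo] + [x - bound for x in values])
        l_hi = min([hi] + [x + bound for x in values])
        if l_lo > l_hi:
            raise ValueError("empty label: the network was not globally consistent")
        values.append(pick(l_lo, l_hi))           # step 4: any value of the label will do
    return values

xs = instantiate(5, (0.0, 10.0), 2.0)
print(xs)
```

For this particular family of constraints the intersection in step 3 is never
empty, so the `ValueError` branch only documents what a failure of global
consistency would look like.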

4  Partial  and  directional  convexity  properties 

Constraint  convexity  is  a  rather  strong  condition,  but  it  turns  out  that  weaker 
forms  of  convexity  are  often  sufficient  to  satisfy  the  conditions  of  a  globally 
consistent  labeling. 

Partially convex binary CCSPs. We introduce in [4, 5] a new category of
partial convexity called (x)-convexity for binary relations. This property is more
restrictive than path and simple connectivity but guarantees that convexity is
maintained while enforcing relational consistency.

Definition 13. ((x)-Convexity [4])

Let $R$ be a binary relation defined by a set of algebraic or transcendental
constraints on two variables $x_1, x_2$. $R$ is said to be $(x_k)$-convex in the
domain $D_{x_1} \times D_{x_2}$ if for any two points $q_1$ and $q_2$ of $r$ such
that the segment $q_1 q_2$ is parallel to $x_k$, $q_1 q_2$ is entirely contained
in $r$.
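On a discretized relation the definition reduces to a contiguity test along one
axis. The sketch below assumes the relation is given as a boolean grid indexed
`grid[y][x]`, which is an illustrative encoding and not the quadtree procedure
of [5]; `is_axis_convex` is a hypothetical helper.

```python
def is_axis_convex(grid, axis):
    """axis=0: segments parallel to x (rows); axis=1: parallel to y (columns).
    The relation is convex along the axis if the feasible cells of every
    such line form one contiguous run."""
    lines = grid if axis == 0 else list(zip(*grid))
    for line in lines:
        idx = [i for i, cell in enumerate(line) if cell]
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True

# A "U"-shaped region: convex along y, but not along x (the top row has a gap).
u_shape = [[1, 0, 1],
           [1, 1, 1]]
print(is_axis_convex(u_shape, 0), is_axis_convex(u_shape, 1))
```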

A network is said to be (x)-convex if each of its relations $R(x, y)$ is
(x)-convex. (x)-convexity guarantees the convexity of any unary projection of a
given relation [5]. This allows the formulation of the following result [4].

Theorem 14. A binary CCSP which is (x)-convex and path-consistent is minimal
and decomposable.

The (x)-convexity property is non-conservative with respect to intersection.
Hence, the global consistency property stated by Theorem 14 is only guaranteed
for the a posteriori network computed by path-consistency.

Directional (x)-convexity. In the case where a discrete network does not
satisfy the row-convexity property, van Beek shows that directional
row-convexity remains a useful property for obtaining backtrack-free solutions.
Similar results generalize to the case of (x)-convex relations. The following
Theorem states [4] that a partial (x)-convexity of the network is sufficient to
ensure that a solution can be determined without backtracking.

Theorem 15. Let $\mathcal{N}$ be a path-consistent binary constraint network. If
there exists an ordering of the variables $x_1, \ldots, x_n$ such that each
relation $R(x_i, x_j)$ of $\mathcal{N}$ with $1 \le j < i$ is $(x_i)$-convex,
then a consistent instantiation can be found without backtracking.

Directional row-convexity imposes ordering conditions on both variables and
variable domains. Since discrete CSPs do not have the strictly ordered domains
characterizing continuous CSPs, the fact that a constraint network is row-convex
can sometimes be hidden. This is obviously not the case for continuous CSPs
concerning (x)-convexity.

Partially  convex  n-ary  relations  Similarly  to  the  case  of  binary  constraints, 
a  less  restrictive  convexity  condition  can  be  defined  for  n-ary  constraints.  We 
first  propose  the  following  generalization  of  the  (x)-convexity  property: 

Definition 16. (($x_1, \ldots, x_k$)-Convexity)

Let $R$ be an n-ary relation between $n$ variables $x_1, \ldots, x_n$. $R$ is
said to be $(x_1, \ldots, x_k)$-convex in the domains
$D_1 \times \ldots \times D_n$ if for any two points $q_1$ and $q_2$ of $r$ such
that the segment $q_1 q_2$ is on a plane parallel to
$x_1 \times \ldots \times x_k$, $q_1 q_2$ is entirely contained in $r$.

Informally, this means that a relation is $(x_1, \ldots, x_k)$-convex if any
sub-projection over the subset $(x_1, \ldots, x_k)$ yields a convex k-ary region.
In the case of networks of arity $r$, the composition of two maximal arity
constraints having at least one variable in common results in a relation of
arity $2r - 1$. In analogy to the case of ternary networks, we observe that the
extension of an r-ary set of variables to a region of arity $r - 1$ does not
involve relations with arity greater than $r$. To apply Helly's Theorem we must
introduce the notion of (r,r-1)-relational consistency, which guarantees that
each set of $r$ relations having $r - 1$ variables in common has a non-null
intersection:

Definition 17. ((r,r-1)-relational consistency)

Let $\mathcal{P}$ be a network of relations over a set of variables $X$, of
arity $r$. Let $R_1(x_1, y_1, \ldots, y_{r-1}), \ldots, R_r(x_r, y_1, \ldots, y_{r-1})$
be $r$ relations of $\mathcal{P}$ sharing the $r-1$ variables
$\{y_1, \ldots, y_{r-1}\}$. The relations are (r,r-1)-relationally consistent
relative to the shared variables if and only if any consistent instantiation of
the variables in $\{x_1, \ldots, x_r\}$ has an extension to
$\{y_1, \ldots, y_{r-1}\}$ that satisfies all relations simultaneously. The
network $\mathcal{P}$ is relationally (r,r-1) consistent if and only if all
relations are (r,r-1)-consistent with respect to all subsets of shared variables.

Hence,  the  following  generalization  of  Theorem  14  can  be  proposed: 

Theorem 18. Let $\mathcal{P}$ be a constraint network of arity at most $r$ that
is $(x_1, \ldots, x_{r-1})$-convex. If $\mathcal{P}$ is (r,r-1)-relationally
consistent, then it is globally consistent.

Proof. The proof is similar to the one given for Theorem 7. In the general case,
the nodes in the binary dual representation of $\mathcal{P}$ represent
$(r-1)$-ary subsets of $\mathcal{P}$'s variables.

Directional (x,y)-convexity. In the proof of Theorem 7, convexity of the
constraints is used only in the application of Helly's Theorem to ensure
extensibility of partial solutions of the dual network. Here, it would be
sufficient to have convexity hold only in two of the three dimensions involved
in the constraint. Thus, we have the following Theorem:

Theorem 19. Let $\mathcal{P}$ be a (3,2)-relationally consistent ternary
constraint network. If there exists an ordering of the variables
$x_1, \ldots, x_n$ such that for any $i, j, k$ with $1 \le i < j < k \le n$,
$R(x_i, x_j, x_k)$ is $(x_j, x_k)$-convex, then the network is globally
consistent and a consistent instantiation can be found without backtracking.

Sketch of proof. According to Helly's Theorem in two dimensions, the fact that
each ternary relation $R(x_i, x_j, x_k)$ is $(x_j, x_k)$-convex guarantees that
the binary relations $R(x_j, x_k)$ derived from the problem are non-empty
($R(x_j, x_k) = \bigcap_{i} \Pi_{x_j, x_k} R(x_i, x_j, x_k)$). By construction,
these binary relations are convex and have non-null pairwise intersections.
Consequently, an argument similar to the one given for the proof of Theorem 7
holds and instantiation can be carried out backtrack-free □.

5  Exploiting  convexity  in  practice 

In discrete domains, relations are represented simply as enumerations of values
or value combinations. In continuous domains, sets of individual values are often
compact and can be represented by one or a small collection of intervals.
However, representing and manipulating labels of several variables, as is
necessary for implementing algorithms for degrees of consistency higher than
two, is more involved, as such labels may be complex geometric shapes.

Constraint representation. In [4], we propose to represent numerical
constraints using $2^k$-trees (a hierarchical representation of space commonly
used in vision and spatial reasoning [7]). The $2^k$-tree representation of
constraints is based on the observation that in most practical applications each
variable takes its values in a bounded domain (bounded interval) and there
exists a maximum precision with which results can be used.

Provided that these two assumptions hold, a relation defined by inequalities
can be approximated by carrying out a hierarchical binary decomposition of its
solution space into $2^k$-trees (quadtrees for binary relations, octrees for
ternary ones, etc.) (see Figure 4). In order to provide a unified framework for
handling both inequalities and equalities, we propose in [4] to translate
equalities into a weaker form called toleranced equalities [6]: the final grey
nodes of the $2^k$-tree decomposition for an equality constraint are replaced by
white nodes. This amounts to replacing each equality by two inequalities close
to each other and is acceptable in practice as long as the results can be
identified with a limited precision.

Fig. 4. A continuous relation can be approximated by carrying out a hierarchical
binary decomposition of its solution space into a $2^k$-tree where: white nodes
represent completely legal solution regions, grey nodes partially legal and
partially illegal regions, and black nodes completely illegal ones.
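A quadtree decomposition of a single inequality can be sketched as follows. This
is an illustration under stated assumptions, not the construction of [4, 6]: a
cell is classified by evaluating the constraint at its four corners only, which
is sound for white cells when the feasible region is convex but may wrongly
mark a cell black if the region enters it without covering a corner. The names
`build` and `area` are hypothetical.

```python
def build(pred, x0, y0, size, depth):
    """Return 'white', 'black', 'grey', or a list of four sub-trees."""
    corners = [pred(x0 + dx, y0 + dy) for dx in (0, size) for dy in (0, size)]
    if all(corners):
        return "white"            # all corners feasible: cell kept as fully legal
    if not any(corners):
        return "black"            # no corner feasible: cell treated as illegal
    if depth == 0:
        return "grey"             # boundary cell at the maximum resolution
    h = size / 2
    return [build(pred, x0 + dx, y0 + dy, h, depth - 1)
            for dx in (0, h) for dy in (0, h)]

def area(tree, size, colors):
    """Total area of the leaves whose colour is in `colors`."""
    if isinstance(tree, str):
        return size * size if tree in colors else 0.0
    return sum(area(sub, size / 2, colors) for sub in tree)

# Constraint x + y <= 1 on the unit square; its exact feasible area is 0.5.
tree = build(lambda x, y: x + y <= 1.0, 0.0, 0.0, 1.0, 3)
inner = area(tree, 1.0, {"white"})
outer = area(tree, 1.0, {"white", "grey"})
print(inner, outer)
```

The white leaves under-approximate the feasible region and the white plus grey
leaves over-approximate it; increasing `depth` narrows the gap between the two.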


Using this discretized representation, the sets of feasible value combinations
can be interpreted and manipulated explicitly. The main advantage is that the
use of complex numerical tools for solving sets of simultaneous constraints
(stated implicitly by their mathematical expressions) can be avoided. The
approximate solution region defined by a set of several constraints is
constructed by projecting, composing and intersecting their individual
$2^k$-tree representations. These operations on $2^k$-trees are easy to
implement and have been extensively studied in computer vision, computer
graphics and image processing. Moreover, constructing the $2^k$-tree
representation for a single constraint only requires evaluating an individual
constraint equation at certain points in the space [6]. As we show in [4, 5],
the explicit handling of solution regions using $2^k$-trees yields a simple
implementation for path-consistency and higher degrees of consistency in
continuous domains.


Correctness of the representation. A $2^k$-tree representation can be
interpreted as providing two different approximations for a feasible region:

- the inner content approximation, $\mathcal{I}(S)$, is given by the white nodes
(interior nodes) and is entirely enclosed within the solution space. Since all
values within $\mathcal{I}(S)$ are consistent, it is a sound approximation.
However, some solutions may be missing from this representation,

- the closest outer content approximation, $\mathcal{O}(S)$, is given by the
union of the white and grey nodes (interior nodes ∪ boundary nodes). This
approximation is guaranteed to contain all solutions, but it may not be sound
since the grey nodes contain inconsistent values.

With regard to equalities, remember that this method only allows toleranced
equalities and soundness will only hold with respect to these tolerances. For the
initial $2^k$-tree representations of individual constraints, we can always
guarantee that $\mathcal{I}(S)$ and $\mathcal{O}(S)$ are as close as possible to
the actual solution region. However, constructing total constraints and
enforcing consistency involves composition and intersection operations on
constraints.

Let $S_1 \oplus S_2$ denote the solution space resulting from the intersection
of $S_1$ and $S_2$. We can show the following properties (see [5]):

- The inner content approximation is exact with respect to intersection:
$\mathcal{I}(S_1) \oplus \mathcal{I}(S_2) = \mathcal{I}(S_1 \oplus S_2)$

- The outer content approximation may contain spurious nodes after
intersection:
$\mathcal{O}(S_1 \oplus S_2) \subseteq \mathcal{O}(S_1) \oplus \mathcal{O}(S_2)$
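Both properties can be checked on a one-dimensional analogue, where a solution
space is a real interval and the decomposition is a row of unit cells. This is a
simplified sketch (one dimension, a single fixed resolution) with hypothetical
helper names, not the $2^k$-tree machinery of [5].

```python
def intersect(s1, s2):
    """Intersection of two intervals (a, b), or None if it has no interior."""
    lo, hi = max(s1[0], s2[0]), min(s1[1], s2[1])
    return (lo, hi) if lo < hi else None

def inner(s, n):
    """Cells [i, i+1] entirely inside s (the white nodes)."""
    if s is None:
        return set()
    a, b = s
    return {i for i in range(n) if a <= i and i + 1 <= b}

def outer(s, n):
    """Cells [i, i+1] overlapping s (white and grey nodes together)."""
    if s is None:
        return set()
    a, b = s
    return {i for i in range(n) if max(a, i) < min(b, i + 1)}

n = 6
s1, s2 = (0.0, 2.3), (2.7, 5.0)          # disjoint, but both reach into cell 2
both = intersect(s1, s2)
print(inner(s1, n) & inner(s2, n), inner(both, n))   # equal: inner is exact
print(outer(both, n), outer(s1, n) & outer(s2, n))   # cell 2 is spurious
```

Cell 2 is grey for both operands, so it survives the intersection of the outer
approximations even though the true intersection is empty.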

For composition, it is possible to show that the projection of a constraint into
a higher-dimensional space is exact for both inner and outer approximations.
Therefore, the $\mathcal{I}(S)$ representation of total constraints is exact,
even after executing consistency algorithms. This means that it is both sound
(not containing any spurious inconsistent values) and maximal, in the sense that
there is no larger sound $\mathcal{I}(S)$ approximation for a given precision.
On the other hand, the $\mathcal{O}(S)$ representation computed by logical
combination of simultaneous constraints is complete but not sound with respect
to the minimal enclosing approximation $\mathcal{O}(S)$: spurious grey nodes can
be created by intersections.

$2^k$-trees and convexity. Since the $2^k$-tree decomposition generates stepwise
approximations of the boundaries, convexity is obviously not preserved in the
strict mathematical sense. In [5], we show that when the resolution chosen is
insufficient, situations may occur where a connected solution space is
represented by a disconnected or even empty $\mathcal{I}(S)$ representation.
However, these limitations are compensated by the fact that:

- the $\mathcal{I}(S)$ representation of a convex solution space can be empty or
disconnected only when the solution of the CCSP falls within the limit of
resolution chosen for the $2^k$-tree representation. This situation is therefore
restricted to limit cases,

- when the $\mathcal{I}(S)$ representation of a convex solution space is
disconnected, a single additional level of decomposition is then sufficient to
make the representation connected again.

Hence, if a disconnection occurs (limit cases), it is possible either to resort
to further refinements of the quadtrees or to neglect the solution region within
the disconnected area, considering that its identification requires a precision
having no significance for the application. Moreover, a particular class of
minimal convexity deficiencies can be identified which precludes the risk of
disconnection (see [5]). The $2^k$-trees having minimal deficiencies of this type
are said to be convex.

Checking for convexity. When constraints are approximated using quadtrees,
we show in [5] that the (x)-convexity property can be checked for in
$O(N \log_4(N) + 2D_x/\epsilon)$, where $N$ is the number of feasibility nodes,
$D_x$ is the domain size of variable $x$ (i.e. interval length) and $\epsilon$
the minimal interval length of $x$ in the quadtree decomposition. Similarly,
convexity can be checked for in $O(2N \log_4(N) + 2D/\epsilon)$, where $N$ is the
number of feasibility nodes, $D$ is the maximal domain size in the quadtree
(i.e. interval length) and $\epsilon$ the minimal interval length in the
quadtree decomposition (for a fixed precision, this complexity is
$O(N \log_4(N))$). Finally, using analogous procedures, the
$(x_1, x_2)$-convexity property, useful for solving ternary problems, can be
checked for in $O(N + N \log_4(N))$ for a fixed precision. These simple convexity
checking procedures exhaustively examine the boundary nodes of the $2^k$-tree
representations. For detailed descriptions we refer the reader to [5].


Comparison with the discrete case: $2^k$-trees and backtrack-free search.
It is worth mentioning that the results on n-ary constraints (see Section 3) are
not directly transferable to discrete domains. Ensuring backtrack-free search in
ternary constraints requires convexity conditions to hold in $\mathbb{R}^2$
rather than $\mathbb{R}$. We have shown that (3,2)-relational consistency is
equivalent to global consistency, but the backtrack-free instantiation might
require refining the resolution of different variables. This is possible in
continuous domains, but not possible if we use a continuous domain to represent
a discrete problem.

Consider  the  following  example  where  two  matrices  representing  discrete  re¬ 
lations  are  understood  as  showing  convex  solution  regions.  When  we  intersect 
the  two  regions,  the  result  is: 


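[Matrix equation garbled in extraction: the intersection (⊕) of two 2×2 relation
matrices yields the all-zero matrix
$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.]
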

i.e.  the  intersection  has  been  lost  as  it  is  smaller  than  the  resolution  limit. 
In  a  continuous  problem,  we  can  now  refine  the  resolution  to  make  this  problem 
go  away,  and  continue  with  the  instantiation.  But  in  a  discrete  problem,  we  do 
not  have  this  possibility  as  the  maximum  resolution  is  fixed.  A  path-consistent 
labeling  does  not  guarantee  that  we  can  in  fact  successfully  complete  a  backtrack- 
free  instantiation  without  need  for  increasing  the  resolution,  and  hence  does  not 
guarantee  global  consistency  in  a  discrete  problem  where  this  possibility  does 
not  exist. 
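The loss of an intersection at a fixed resolution can be reproduced with two
small 0/1 matrices. The operands below are hypothetical, chosen only for
illustration: each one discretizes a thin diagonal band on a 2×2 grid, and the
two bands cross at a point that no single cell captures.

```python
def intersect(r1, r2):
    """Cell-wise intersection of two 0/1 relation matrices of equal shape."""
    return [[a & b for a, b in zip(row1, row2)] for row1, row2 in zip(r1, r2)]

falling = [[1, 0],
           [0, 1]]      # band along one diagonal (hypothetical operand)
rising = [[0, 1],
          [1, 0]]       # band along the other diagonal (hypothetical operand)

print(intersect(falling, rising))
```

In a continuous problem the grid could be refined until the crossing point falls
inside a cell shared by both bands; with fixed discrete domains there is no
finer grid to move to.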

On the other hand, extending the row-convexity property to 2D, so that
Helly's Theorem can ensure the binary extensibility condition, would dictate
that each discrete ternary relation yields a universal matrix (with 1s only) as
binary projection, which is probably too restrictive for practical use.


6  Example 

We now sketch out how the introductory example of Figure 1 is solved using our
method. In this example, four main independent variables, beam depth ($H_b$),
slab thickness ($H_s$), beam span ($W$) and beam spacing ($S$), are linked
together through non-linear constraints.

Fig. 5. Three constraints from the steel structure problem. The areas within the
dashed lines are those removed while enforcing global consistency.

Traditionally, an engineer works through these equations hierarchically; at
no time is the complete solution set known. Exploration of possible solutions is
done point by point according to the experience of the designer.

Using our system, a prototype Lisp implementation calculates the globally
consistent solutions in approximately 1800 seconds on a Silicon Graphics Indigo
with an R4000 processor. Figure 5 shows a set of constraints derived from the
problem after global relaxation: the areas within the dashed lines are those
removed by (3,2)-relational consistency (w is an intermediate variable used when
transforming the original problem into a ternary one,
u — SlS.10~^Hg * S + 0.0054).

This shows that the problem in fact admits a large space of potential
solutions, of which current mathematical methods only find a single one.


7  Conclusion 

Convexity has been shown to be a useful property for efficiently solving
binary discrete and temporal constraint satisfaction problems. In this paper, we
propose a generalization of these results to continuous constraints of arbitrary
arities. While in discrete domains it has been shown that the generalization of
the results on convexity to ternary (and higher arity) constraints may pose
complexity problems, we introduce a concept of (3,2)-relational consistency which
can be computed in polynomial time and proven equivalent to global consistency
for constraint networks containing ternary constraints as well. Since n-ary
constraint problems can always be transformed into equivalent ternary ones, these
results guarantee polynomial-time solution for a large class of continuous n-ary
problems. We also show how these results can be exploited in practice. The
applicability condition of these results is that a limited precision must exist
under which the results have no significance. This condition holds for almost
all engineering applications manipulating physical entities.


Abstract. We present an experimental comparison of three modified
DeltaBlue algorithms for local-propagation-based constraint solving. Our
three modified methods are called the DeltaDown method, the DeltaUp method
and the DeltaCost method, respectively. These methods were designed
to speed up the planning phase or the evaluation phase of the original
DeltaBlue method using additional cost functions to break ties in the
walkabout strength. Our cost functions are called up cost and down cost,
respectively. These cost functions give us information about the
upstream and the downstream constraints. Our experiments show that the
DeltaUp method brings a considerable improvement in the total performance
of the DeltaBlue method at the small overhead of maintaining the cost
function.


Keywords: Constraint solving algorithm, Local propagation, DeltaBlue
method, DeltaDown method, DeltaUp method, and DeltaCost method

1  Introduction 

The DeltaBlue algorithm is a widely used constraint solving method based on
local propagation [1-6]. The purpose of this paper is to present an experimental
comparison of three modified DeltaBlue algorithms.

These modified algorithms are called the DeltaDown method, the DeltaUp method
and the DeltaCost method, respectively. These methods were designed to speed up
the planning phase or the evaluation phase of the DeltaBlue method without using
expensive bookkeeping operations. Our experiments show that the planning phase
of the DeltaUp method works at least as fast as that of the DeltaBlue method and
that the DeltaUp method brings a considerable improvement in the total
performance of the DeltaBlue method.

The organization of the rest of this paper is as follows. In Section 2 we review
basic definitions of constraint problems and the DeltaBlue method. In Section 3
we present our three modified DeltaBlue methods. In Section 4 we give
comparisons of our methods with the DeltaBlue method. In Section 5 we give our
conclusion.

2  Preparation

2.1  Definitions

We summarize basic definitions for constraint problems and constraint solving
methods based on local propagation.

A constraint problem P consists of a set of variables VSET, a set of constraints
CSET, and a set of methods MSET.

A constraint C of CSET is a relation on the values of variables
$V_1, \ldots, V_n$ of VSET. A constraint C has a priority level, which is an
integer from 0 to k. A priority level is also called the strength of a
constraint. We assume smaller integers have stronger priorities. A
constraint of priority level 0 is called a required constraint. Required
constraints must be satisfied in any constraint problem. A stay constraint
is a constraint to keep the same value. An input constraint is a constraint
whose value is given as an input value from outside.

A method M for a constraint C is a function that realizes the constraint
relation of C such that some of the variables of C are output variables and
the rest of the variables of C are input variables. The number of output
variables is called the output degree of the method. For each constraint we
may have one or more methods. In what follows, we deal with methods whose
output degree is exactly one.

We can represent a constraint C of variables $V_1, \ldots, V_n$ by an
undirected hyperedge connecting points respectively representing variables
$V_1, \ldots, V_n$. We can also represent a method M of C by a directed
hyperedge, where we add to the undirected hyperedge of C an arrow pointing
into the output variable.

A constraint graph of a constraint problem P is an undirected hypergraph
such that points represent variables of P and undirected hyperedges
represent constraints of P. We show an example of such a constraint graph in
Fig. 1. This undirected hypergraph represents a constraint problem of five
variables a, b, c, d and e such that a + b = c and c × d = e.

A data-flow graph of P is a directed hypergraph obtained from a constraint
graph of P by replacing each undirected hyperedge of C by a directed
hyperedge representing one method of C.

A solution graph of P is a data-flow graph of P such that there are no
directed cycles and no two arrows pointing into the same point. In Fig. 2 we
show one solution graph for the constraint graph of Fig. 1.
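
As a small illustration of the two conditions above, the following Python
sketch checks whether a choice of methods forms a solution graph. The
representation (each selected method as a pair of input variables and one
output variable) and the name `is_solution_graph` are assumptions made for
this example only; they do not come from the original algorithms.

```python
def is_solution_graph(methods):
    """methods: list of (inputs, output), one selected method per constraint."""
    outputs = [out for _, out in methods]
    if len(outputs) != len(set(outputs)):
        return False  # two arrows point into the same variable

    # Directed edges run from every input variable to the output variable.
    succ = {}
    for ins, out in methods:
        for v in ins:
            succ.setdefault(v, set()).add(out)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {}

    def visit(v):
        color[v] = GREY
        for w in succ.get(v, ()):
            state = color.get(w, WHITE)
            if state == GREY or (state == WHITE and not visit(w)):
                return False  # reached a variable on the stack: a cycle
        color[v] = BLACK
        return True

    return all(color.get(v, WHITE) != WHITE or visit(v) for v in list(succ))


# a + b = c computed into c, and c * d = e computed into e.
print(is_solution_graph([(("a", "b"), "c"), (("c", "d"), "e")]))
```

The depth-first search is recursive, so this sketch is only meant for small
graphs.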

A locally-predicate-better solution of constraint problem P is a maximally
better solution using the following better relation for solutions x and y
of P.

A solution x is better than a solution y if and only if there exists a level
i satisfying the following conditions.

1. For constraints of levels 0 to i-1, solutions x and y satisfy the same
constraints.

2. For constraints of level i, the set of constraints satisfied by y is a
proper subset of the set of constraints satisfied by x.
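
The better relation can be read as a level-by-level comparison. The sketch
below is one possible reading in Python, assuming that each solution is
summarized by a dictionary from priority level to the set of constraints it
satisfies at that level; the function name `better` and this encoding are
hypothetical.

```python
def better(x_sat, y_sat, k):
    """True if solution x is better than solution y.

    x_sat[i] and y_sat[i] are the sets of level-i constraints satisfied by
    x and y; k is the weakest priority level.
    """
    for i in range(k + 1):
        xi = x_sat.get(i, set())
        yi = y_sat.get(i, set())
        if yi < xi:       # proper subset at level i, all stronger levels equal
            return True
        if xi != yi:      # levels 0..i no longer agree, so no later i works
            return False
    return False


x = {0: {"r1"}, 1: {"c1", "c2"}}
y = {0: {"r1"}, 1: {"c1"}}
print(better(x, y, 1), better(y, x, 1))
```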
Fig. 1. A constraint graph

Fig. 2. A solution graph


2.2  DeltaBlue  Method 

DeltaBlue method is an incremental constraint solving method based on local
propagation. DeltaBlue algorithm uses the walkabout strength to determine
the direction of propagation. The walkabout strength is the weakest level of
priorities of constraints existing in the upstream of the variable of a
solution graph.

Definition 1 (Walkabout strength) The walkabout strength of a variable V
is the weakest of the walkabout strengths of the input variables of the
method whose output variable is V and the level of priority of the
constraint of that method. If V has no such input variables and no
associated constraint, then the walkabout strength of V is the weakest
priority level.
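
Definition 1 is recursive, which the following Python sketch tries to make
concrete. It assumes an acyclic solution graph stored as a dictionary from
each determined variable to its method, and a weakest level `WEAKEST = 4`;
the `Method` record and all names are illustrative assumptions, not the data
structures of DeltaBlue itself.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

WEAKEST = 4  # assumed weakest priority level k; level 0 is "required"


@dataclass(frozen=True)
class Method:
    strength: int            # priority level of the owning constraint
    inputs: Tuple[str, ...]  # input variables of the method
    output: str              # the single output variable


def walkabout(var: str, determined_by: Dict[str, Method]) -> int:
    """Walkabout strength of `var`; larger integers mean weaker levels."""
    method = determined_by.get(var)
    if method is None:
        return WEAKEST  # no method determines var
    levels = [method.strength]
    levels += [walkabout(v, determined_by) for v in method.inputs]
    return max(levels)  # the weakest level is the largest integer


# A stay constraint of level 2 on a, and a required a + b = c writing c.
graph = {
    "a": Method(2, (), "a"),
    "c": Method(0, ("a", "b"), "c"),
}
print(walkabout("a", graph), walkabout("b", graph), walkabout("c", graph))
```

In this example c inherits the weakest level from the undetermined input b,
even though its own constraint is required.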

The  outline  of  DeltaBlue  method  is  as  follows. 

Method  1  (DeltaBlue  method) 

Input  :  A  solution  graph  and  an  added  constraint  C. 

1. We select a method of C whose output variable has the weakest walkabout
strength and whose output variable is not yet used. Let the output variable
of this method be V. If the walkabout strength of V is not weaker than the
strength of C, then we are done. If C was required, then we also declare an
error.

2.  We  record  that  C  was  enforced  by  the  selected  method  and  V  was  used. 

3.  We  update  the  walkabout  strength  of  V  and  its  downstream  variables.  If  we 
find  any  of  variables  consumed  by  C  among  the  downstream  variables,  then 
we  declare  an  error. 

4. If a constraint D had previously determined V, then we retract D and
attempt to enforce D by performing steps 1-3 on D. Otherwise we are done.

3  Three  Modified  Methods 

DeltaBlue algorithm is an efficient constraint solving method using the
walkabout strength. However, in DeltaBlue algorithm, if walkabout strengths
are all equal, then we have no further criterion. Fig. 3 shows a situation
where walkabout strengths are equal. Here priority levels are strong,
medium, weak and weakest.

Fig. 3. A situation where walkabout strengths are equal

Fig. 4. Another tie situation of walkabout strengths


Fig. 4 shows another tie situation of the walkabout strength. In order to
choose a desirable direction that may speed up the planning phase or the
evaluation phase, we introduce two cost functions called down cost and up
cost. The down cost of a variable is the number of methods existing in the
downstream of the variable. The up cost of a variable is the number of
methods which are to be selected when we determine the value of the variable
by another method.

Definition 2 (Down cost) The down cost of a variable V is the sum of the
down costs of the output variables of the methods whose input variables
include V, plus the number of methods whose input variables include V.
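
Definition 2 can be transcribed almost literally. The Python sketch below
assumes the selected methods of an acyclic solution graph are given as
(inputs, output) pairs; the name `down_cost` is illustrative. It recomputes
costs from scratch, whereas the methods in this paper keep them
incrementally.

```python
def down_cost(var, methods):
    """Down cost of `var`: each consuming method counts 1, plus the down
    cost of that method's output variable."""
    consumers = [(ins, out) for ins, out in methods if var in ins]
    return len(consumers) + sum(down_cost(out, methods) for _, out in consumers)


# A chain a -> b -> c built from two methods.
chain = [(("a",), "b"), (("b",), "c")]
print(down_cost("a", chain), down_cost("b", chain), down_cost("c", chain))
```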

Definition  3  (Up  cost)  The  up  cost  of  a  variable  V,  whose  value  is  determined 
by  method  M  of  constraint  C,  is  the  sum  of  the  following  values. 

1.  Let  W  be  an  input  variable  of  M  such  that  it  has  the  weakest  walkabout 
strength  among  input  variables  of  M.  If  there  exist  two  or  more  such  input 
variables,  then  we  choose  an  input  variable  having  the  smallest  up  cost  U  as 
W.  The  value  is  0,  if  such  W  does  not  exist  at  all  or  the  walkabout  strength 
of  W  is  not  weaker  than  the  strength  of  C.  The  value  is  U,  otherwise. 

2.  The  number  of  methods  whose  output  variable  is  V. 

In Figs. 5 and 6 we show solution graphs with down costs and up costs of
variables.

We can state DeltaDown method as follows. Our DeltaDown method was
designed to reduce the number of recomputations of methods in the evaluation
phase.

Fig. 5. A solution graph with down cost

Fig. 6. A solution graph with up cost


Method  2  (DeltaDown  method) 

Input  :  A  solution  graph  and  an  added  constraint  C. 

1.  We  select  a  method  of  C  whose  output  variable  has  the  weakest  walkabout 
strength  and  whose  output  variable  is  not  yet  used.  If  there  exist  two  or  more 
such  methods,  then  we  select  a  method  which  has  the  smallest  down  cost. 
Let  the  output  variable  of  this  method  be  V.  If  the  walkabout  strength  of  V 
is  not  weaker  than  the  strength  of  C,  then  we  are  done.  If  C  was  required, 
then  we  also  declare  an  error. 

2.  We  record  that  C  was  enforced  by  the  selected  method  and  V  was  used. 

3.  We  update  the  walkabout  strength  of  V  and  its  downstream  variables.  If  we 
find  any  of  variables  consumed  by  C  among  the  downstream  variables,  then 
we  declare  an  error. 

4. If a constraint D had previously determined V, then we retract D and
attempt to enforce D by performing steps 1-3 on D. Otherwise we update the
down cost of all the necessary variables.

In Fig. 7 we show how DeltaDown method works for the solution graph of
Fig. 5.

We can state DeltaUp method as follows. Our DeltaUp method was designed to
reduce the number of reselections of methods in the planning phase.

Method  3  (DeltaUp  method) 

Input : A solution graph and an added constraint C.

1. We select a method of C whose output variable has the weakest walkabout
strength and whose output variable is not yet used. If there exist two or
more such methods, then we select a method which has the smallest up cost.
Let the output variable of this method be V. If the walkabout strength of V
is not weaker than the strength of C, then we are done. If C was required,
then we also declare an error.

2. We record that C was enforced by the selected method and V was used.

3.  We  update  the  walkabout  strength  and  up  cost  of  V  and  its  downstream 
variables.  If  we  find  any  of  variables  consumed  by  C  among  the  downstream 
variables,  then  we  declare  an  error. 

4. If a constraint D had previously determined V, then we retract D and
attempt to enforce D by performing steps 1-3 on D. Otherwise we are done.

In Fig. 8 we show how DeltaUp method works for the solution graph of Fig. 6.

Fig. 7. DeltaDown method

Fig. 8. DeltaUp method


Finally  we  state  DeltaCost  method  which  uses  the  sum  of  down  cost  and  up 
cost  as  the  cost  function.  Our  DeltaCost  method  was  designed  to  reduce  the  sum 
of  the  number  of  reselections  of  methods  in  the  planning  phase  and  the  number 
of  recomputations  of  methods  in  the  evaluation  phase. 

Method  4  (DeltaCost  method) 

Input : A solution graph and an added constraint C.

1. We select a method of C whose output variable has the weakest walkabout
strength and whose output variable is not yet used. If there exist two or
more such methods, then we select a method which has the smallest sum of
up cost and down cost. Let the output variable of this method be V. If the
walkabout strength of V is not weaker than the strength of C, then we are
done. If C was required, then we also declare an error.

2.  We  record  that  C  was  enforced  by  the  selected  method  and  V  was  used. 

3.  We  update  the  walkabout  strength  and  up  cost  of  V  and  its  downstream 
variables.  If  we  find  any  of  variables  consumed  by  C  among  the  downstream 
variables,  then  we  declare  an  error. 

4. If a constraint D had previously determined V, then we retract D and
attempt to enforce D by performing steps 1-3 on D. Otherwise we update the
down cost of all the necessary variables.
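
The four methods differ only in how step 1 breaks ties among output
variables of equal walkabout strength. The Python sketch below isolates that
step under simplifying assumptions: walkabout strengths and costs are given
as precomputed dictionaries, the cost of a method is taken to be the cost of
its output variable, and the name `select_method` is hypothetical. Passing
no cost gives the DeltaBlue behaviour; passing down costs, up costs, or
their sums would correspond to DeltaDown, DeltaUp and DeltaCost.

```python
def select_method(candidates, constraint_strength, used, walkabout, cost=None):
    """Step 1: choose a method of the added constraint, or None.

    candidates: list of (inputs, output) methods of the constraint.
    walkabout, cost: dicts from variable to integer (larger = weaker/costlier).
    """
    free = [m for m in candidates if m[1] not in used]
    if not free:
        return None
    weakest = max(walkabout[m[1]] for m in free)
    tied = [m for m in free if walkabout[m[1]] == weakest]
    if cost is None:
        best = tied[0]  # no further criterion among tied methods
    else:
        best = min(tied, key=lambda m: cost[m[1]])
    if walkabout[best[1]] <= constraint_strength:
        return None  # output is not weaker than the constraint itself
    return best


methods = [(("b", "c"), "a"), (("a", "c"), "b"), (("a", "b"), "c")]
strengths = {"a": 3, "b": 3, "c": 1}
costs = {"a": 5, "b": 2, "c": 0}
print(select_method(methods, 2, set(), strengths))
print(select_method(methods, 2, set(), strengths, costs))
```

Here a and b tie on walkabout strength, so only the cost-aware call prefers
the method writing b.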

4  Comparison 

For a constraint problem such that the number of constraints is N and the
maximum number of methods for a constraint is M, the complexities of the
planning phase of the four constraint solving methods are as follows.

DeltaBlue method    $O(MN)$
DeltaDown method    $O(MN^2)$
DeltaUp method      $O(MN)$
DeltaCost method

Hence we need to measure the overhead of keeping down cost and up cost in
the planning phase as well as the total performance of these methods.

First we used a linear chain constraint problem of Fig. 9 to measure the
overhead time in the planning phase. Because the linear chain constraint
problem has no ambiguity from the viewpoint of the walkabout strength, this
case is considered to be the worst case for our modified methods. In Fig. 10
we measured the time to create a long linear chain from left to right. In
Fig. 11 we measured the time to add an input constraint to the left end of a
chain. In Fig. 12 we measured the time to remove the input constraint from a
chain. These results show that DeltaUp method works at least as efficiently
as DeltaBlue method during the planning phase, even in this worst case for
bookkeeping overhead.

Secondly we used a binary tree maintenance problem of Fig. 13 to measure
the total performance. In Fig. 14 we measured the time of nine different
types of operations to maintain a binary tree. Operations 1 to 5 are for
binary tree creation. Operations 6 and 7 are for binary tree moving.
Operations 8 and 9 are for binary tree swapping. In Tables 1 and 2 we
respectively show the number of reselections of methods in the planning
phase and the number of recomputations of methods in the evaluation phase.

These results show that DeltaUp method achieves a considerable improvement
in total time compared with DeltaBlue method.

Our interpretation of the performance of DeltaUp method is as follows.

1. When constraint problems have ambiguity from the viewpoint of the
walkabout strength, the variance of the number of reselections of methods in
the planning phase of DeltaBlue method and the variance of the number of
recomputations of methods in the evaluation phase of DeltaBlue method tend
to get larger.

2. When constraint problems have ambiguity from the viewpoint of the
walkabout strength, DeltaUp method reduces both the number of reselections
of methods in the planning phase and the time for the planning phase, even
if we include the overhead time. The time for the evaluation phase may
decrease or may increase. The reduction of the time for the planning phase
contributes to the reduction of the total execution time.

Fig. 9. A long chain of constraints

Fig. 10. Time to create a long chain

Fig. 11. Time to add an input constraint to a chain

Fig. 12. Time to delete an input constraint from a chain

Fig. 13. A binary tree maintenance problem

Fig. 14. Performance of four constraint solving methods

Table 1. The number of reselections of methods in the planning phase

Operation number    1    2    3    4    5    6    7    8    9
DeltaBlue          39   58   73   11   18   23   47    7   16
DeltaDown          47   63   80   11   18   25   37    7   16
DeltaCost          42   62   76   11   18   25   37    7   16
DeltaUp            34   49   57    8   18   13   23    4   11

Table 2. The number of recomputations of methods in the evaluation phase

Operation number    1    2    3    4    5    6    7    8    9
DeltaBlue          79  162  207   65   86  230  255  248  176
DeltaDown          90   98  151   65   86  230  243  248  176
DeltaCost          56   88  105   65   86  230  243  248  176
DeltaUp            70  125  156   88  140  230  241  250  190

5  Conclusion 

We have presented three modified DeltaBlue algorithms to speed up the
planning phase or the evaluation phase. DeltaUp method brings a considerable
improvement of the performance of DeltaBlue method at a small bookkeeping
overhead. DeltaDown method and DeltaCost method work more slowly than
DeltaBlue method. This is because they use the down cost and increase the
time for the planning phase more than they reduce the time for the
evaluation phase.


Abstract. In this paper, we propose an extension of the Jaffar-Lassez
Constraint Logic Programming scheme that operates with unions of constraint
theories with different signatures and decides the satisfiability of
mixed constraints by appropriately combining the constraint solvers of
the component theories. We describe the extended scheme and provide
logical and operational semantics for it along the lines of those given for
the original scheme. Then we show how the main soundness and completeness
results of Constraint Logic Programming lift to our extension.

Keywords:  Constraint  Logic  Programming,  combination  of  satisfiability 
procedures. 


1  Introduction 

The  Constraint  Logic  Programming  scheme  was  originally  developed  in  [8]  by 
Jaffar  and  Lassez  as  a  principled  way  to  combine  the  two  computational  paradigms 
of  Logic  Programming  and  Constraint  Solving.  The  scheme  extends  conventional 
Logic  Programming  by  replacing  the  notion  of  unifiability  with  that  of  constraint 
solvability  over  an  underlying  constraint  domain.  As  originally  proposed,  the 
CLP  scheme  extends  immediately  to  the  case  of  multiple  constraint  domains  as 
long  as  they  do  not  share  function  or  predicate  symbols.  The  scheme  though 
does  not  account  for  the  presence  of  mixed  terms,  terms  built  with  functors  from 
different  signatures,  and  corresponding  mixed  constraints.  The  reason  for  this  is 
that,  although  the  CLP  scheme  allows  in  principle  multiple,  separate  constraint 
theories,  each  with  its  own  constraint  solver,  it  is  not  designed  to  operate  on 
their combination, to which mixed constraints belong.

This paper proposes an extension of the scheme to include constraint
domains built as the combination of a number of independent domains, such
as, for instance, the domains of finite trees and real numbers, the domains
of lists, strings, and integers, and the like.

In principle, we can always instantiate the CLP scheme with a suitable
constraint domain once we have a constraint solver for it, no matter whether
the domain is simple or “composite”. For composite constraint domains,
however, it is desirable not to have to build a solver from scratch if a
constraint solver is already available for each component domain.

A lot of research has been done in recent years on domain combination (see,
for instance, [2, 3, 13, 16]), although most of the efforts have been
concentrated on unification problems and equational theories (see [1, 4, 5,
6, 10, 14, 19], among others). The results of these investigations are still
limited in scope and a deep understanding of many model- and proof-theoretic
issues involved is still out of reach. In spite of that, we try to show the
effectiveness of combination techniques by choosing one of the most general
and adapting it so that it can be incorporated in the CLP scheme with few
modifications of the scheme itself.


1.1  Notation  and  Conventions 

We  adhere  rather  closely  to  the  notation  and  definitions  given  in  [15]  for  what 
concerns  mathematical  logic  in  general  and  [9]  for  what  concerns  constraint 
logic  programming  in  particular.  We  report  here  the  most  notable  notational 
conventions  followed.  Other  notation  which  may  appear  in  the  sequel  follows  the 
common  conventions  of  the  two  fields. 

In this paper, we use $v, x, y, z$ as meta-variables for the logical
variables, $s, t$ for first-order terms, $p, q$ for predicate symbols,
$f, g$ for function symbols, $a, b, h$ for atoms, $A$ for a multi-set of
atoms, $c, d$ for constraints, $C$ for a multi-set of constraints,
$\varphi, \psi$ for first-order formulas, and $\vartheta$ for a value
assignment, or valuation, to a set of variables. Some of these symbols may
be subscripted or have an over-tilde, which will represent a finite
sequence. For instance, $\tilde{x}$ stands for a sequence of the form
$(x_1, x_2, \ldots, x_n)$ for some natural number $n$. When convenient, we
will use the tilde notation to denote sets of symbols (as opposed to
sequences). Where $\tilde{s}$ and $\tilde{t}$ both have length $n$, the
equation $\tilde{s} = \tilde{t}$ stands for the system of equations
$\{s_1 = t_1 \wedge \cdots \wedge s_n = t_n\}$.

In general, $var(\varphi)$ is the set of $\varphi$'s free variables. The
shorthand $\exists_{-\tilde{x}}\,\varphi$ stands for the existential
quantification of all the free variables of $\varphi$ that are not contained
in $\tilde{x}$, while $\tilde{\exists}\,\varphi$ stands for the existential
closure of $\varphi$.

We will identify union of multi-sets of formulas with their logical
conjunction. We will also identify first-order theories with their deductive
closure. Where $S, T$ are sets of $\Sigma$-sentences, for some signature
$\Sigma$, $Mod(T)$ is the set of all the $\Sigma$-models of $T$. The notation
$T \models \varphi$ means that $T$ logically entails the universal closure
of $\varphi$, while $S, T \models \varphi$ stands for
$S \cup T \models \varphi$.

We will say that a formula $\varphi$ is satisfiable in a theory $T$ iff
there exists a model of $T$ that satisfies $\tilde{\exists}\,\varphi$.

If  P  is  a  CLP  program  we  will  denote  with  P*  its  Clark  completion. 


1.2  Organization  of  the  Paper 

In Section 2, we briefly describe the original CLP scheme and motivate the
need for constraints with mixed terms, which the CLP scheme does not
explicitly accommodate. In Section 3, we mention a method for deriving a
satisfiability procedure for a combination of theories admitting mixed terms
from the satisfiability procedures of the single theories. In Section 4, we
explain how one can use the main idea of that combination method to extend
the CLP scheme and allow composite constraint domains and mixed terms over
them. In Section 5, we prove some soundness and completeness results for the
new scheme. In Section 6, we summarize the main contribution of this paper,
outlining directions for further development.

2  The  CLP  Scheme 

A thorough description of CLP($\mathcal{X}$), the CLP scheme, can be found
in [9]. We recall here that, in essence, an instance of CLP($\mathcal{X}$) is
obtained by assigning the parameter $\mathcal{X}$ a quadruple that specifies
the chosen constraint domain, its axiomatization, and the features of the
constraint language. More specifically,
$\mathcal{X} := \langle \Sigma, \mathcal{D}, \mathcal{L}, \mathcal{T} \rangle$,
where $\Sigma$ is a signature, $\mathcal{D}$ is a $\Sigma$-structure
representing the constraint domain over which computation is performed,
$\mathcal{L}$ is the constraint language, that is, the class of
$\Sigma$-formulas used to express the constraints, and $\mathcal{T}$ is a
first-order $\Sigma$-theory (with equality) describing the relevant
properties of the domain. A number of assumptions are generally made about
$\mathcal{X}$. The most important are:

-  $\Sigma$ contains the equality symbol, which $\mathcal{D}$ interprets as
the identity in the underlying domain.

-  $\mathcal{L}$ contains an identically true and an identically false
predicate and a set of primitive constraints.

-  $\mathcal{L}$ is closed under variable renaming and logical conjunction.

-  $\mathcal{D}$ and $\mathcal{T}$ correspond on $\mathcal{L}$, that is,
$\mathcal{D}$ is a model of $\mathcal{T}$ and every formula of $\mathcal{L}$
that is satisfiable in $\mathcal{D}$ is satisfiable in every model of
$\mathcal{T}$.

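For some applications $\mathcal{T}$ is required to be satisfaction complete
with respect to $\mathcal{L}$: for every $c \in \mathcal{L}$, either
$\mathcal{T} \models \tilde{\exists}\,c$ or
$\mathcal{T} \models \neg\tilde{\exists}\,c$.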

Prolog itself can be seen as an instance of the CLP scheme, specifically as
CLP($\mathcal{FT}$), where $\mathcal{FT}$ is the constraint domain of finite
trees represented as first-order terms. Actually, all the CLP($\mathcal{X}$)
systems in which $\mathcal{X}$ is not $\mathcal{FT}$ or an extension of it¹
still retain the possibility of building uninterpreted terms and so are at
least CLP($\mathcal{FT}, \mathcal{X}$) systems. Furthermore, many systems
support several constraint domains. They can be seen as CLP($\mathcal{X}$)
systems, with $\mathcal{X} := \{\mathcal{X}_1, \ldots, \mathcal{X}_n\}$,
where the $\mathcal{X}_i$'s are built over disjoint signatures and their
constraints are processed by different, specialized solvers. In these
systems, predicate or function symbols in one signature are applicable, with
few exceptions, only to (non-variable) terms entirely built with symbols
from the same signature.

¹ Prolog II, for instance, works with rational trees instead of finite trees.

Thus, although in one way or another all CLP systems use more than one
constraint domain, they do not freely allow mixed terms or predicates, that
is, expressions built with symbols from different signatures. Meaningful
constraints over such heterogeneous expressions, however, arise naturally in
many classes of applications. An example of a constraint with mixed terms in
the theory of lists and natural numbers is

$v = (x + 9) :: y \;\wedge\; head(y) > z + u$

where  ::  is  the  list  constructor  and  head  returns  the  first  element  of  a  list.  An 
example,  adapted  from  [12],  in  the  theory  of  finite  trees  and  real  numbers,  is 

$f(f(x) - f(y)) \neq f(z) \;\wedge\; y + z < x$

Proper instances of CLP($\mathcal{X}$) cannot deal with these types of
constraints simply because the CLP computational paradigm does not consider
them. In the rest of this paper, we will show a method for extending the CLP
scheme to a new scheme, MCLP($\mathcal{X}$), that offers a systematic and
consistent treatment of constraints with mixed terms. Specifically, we will
show how to convert a CLP($\mathcal{X}$) system into a MCLP($\mathcal{X}$)
system which operates on the constraint structure generated by a suitable
combination of the various $\mathcal{X}_i$'s.

3  Combining  Satisfiability  Procedures 

The main idea of our extension is to adapt and use in CLP a well-known
method, originally proposed by Nelson and Oppen [13], for combining
first-order theories and their satisfiability procedures. Although the
method applies to the combination of any finite number of theories, for
simplicity we will consider the case of just two theories here.

Definition 1. We say that a formula is in simple Conjunctive Normal Form if
it is a conjunction of literals. Given a $\Sigma$-theory $T$, we will denote
with $sCNF(T)$ the set of simple Conjunctive Normal Form $\Sigma$-formulas.

Definition 2. A consistent $\Sigma$-theory $T$ is called stably-infinite iff
any quantifier-free $\Sigma$-formula is satisfiable in $T$ iff it is
satisfiable in an infinite model of $T$.

Let $T_1$ and $T_2$ be two stably-infinite theories with respective
signatures $\Sigma_1$, $\Sigma_2$ such that
$\Sigma_1 \cap \Sigma_2 = \emptyset$.² The simplest combination of $T_1$ and
$T_2$ is the $(\Sigma_1 \cup \Sigma_2)$-theory $T_1 \cup T_2$ defined as (the
deductive closure of) the union of $T_1$ and $T_2$.

If for each $i = 1, 2$ we have a procedure $Sat_i$ that decides the
satisfiability in $T_i$ of the formulas of $sCNF(T_i)$, we can generate a
procedure that decides the satisfiability in $T_1 \cup T_2$ of any formula
$\varphi \in sCNF(T_1 \cup T_2)$ by using $Sat_1$ and $Sat_2$ modularly.
Clearly, because of the expanded signature, $\varphi$ cannot in general be
processed directly by either satisfiability procedure, unless it is of the
form $\varphi_1 \wedge \varphi_2$ (call it separate form), where $\varphi_i$
is a (possibly empty) formula of $sCNF(T_i)$.³ In that case, each $Sat_i$
will process $\varphi_i$ separately.

² We consider the equality symbol as a logical constant.

³ Since $\varphi_i$ does not contain symbols from the other signature, we
say that it is pure.

If $\varphi$ is not already in separate form, it is always possible to apply
a procedure that, given $\varphi$, returns a formula $\psi$ which is in
separate form and is satisfied by exactly the same models and variable
assignments that satisfy $\varphi$.⁴ Some examples of formulas and their
separate forms are given in Figure 1.


$v = (x + 9) :: y \wedge head(y) > z + u$    (1)

$f(f(x) - f(y)) \neq f(z) \wedge y + z < x$    (2)

$(v = x_1 :: y \wedge head(y) = x_2) \wedge (x_1 = x + 9 \wedge x_2 > z + u)$    (3)

$(f(x_1) \neq f(z) \wedge x_2 = f(x) \wedge x_3 = f(y)) \wedge (x_1 = x_2 - x_3 \wedge y + z < x)$    (4)


Fig. 1. Formulas (3) and (4) are separate forms of formulas (1) and (2),
respectively.
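
The step from formula (1) to formula (3) can be imitated on a term level.
The Python sketch below replaces each subterm headed by a symbol of the
other signature with a fresh variable and records a defining equation for
it. It is only an illustration under assumptions of our own: terms are
nested tuples, variables and constants are left untouched, the table `OWNER`
assigning symbols to the two theories is invented for the example, and the
function name `purify` is hypothetical.

```python
from itertools import count

LISTS, NUMBERS = 0, 1
# Assumed assignment of function symbols to the two signatures.
OWNER = {"::": LISTS, "head": LISTS, "+": NUMBERS, "-": NUMBERS}


def purify(term, home, defs, fresh):
    """Return `term` as a pure term of theory `home`.

    Alien subterms are replaced by fresh variables; for each replacement an
    equation (variable, pure_term) is appended to defs[theory_of_subterm].
    """
    if not isinstance(term, tuple):  # variable or constant: kept as is
        return term
    symbol, *args = term
    theory = OWNER[symbol]
    pure = (symbol, *(purify(a, theory, defs, fresh) for a in args))
    if theory == home:
        return pure
    var = f"x{next(fresh)}"          # fresh variable naming the alien term
    defs[theory].append((var, pure))
    return var


# Right-hand side of v = (x + 9) :: y, seen from the theory of lists.
defs = {LISTS: [], NUMBERS: []}
rhs = purify(("::", ("+", "x", 9), "y"), LISTS, defs, count(1))
print(rhs, defs[NUMBERS])
```

The result mirrors the list part `v = x1 :: y` and the arithmetic part
`x1 = x + 9` of formula (3).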


A description of the Nelson-Oppen combination procedure is not necessary
here. For our present purposes it is enough to say that global consistency
between the separate satisfiability procedures is achieved by propagation,
between the procedures, of the entailed equalities between the variables of
the input formula. That this approach is correct and sufficient is justified
by the model-theoretic result given in the following.

Definition 3. If P is any partition of a set of variables V and R is the
corresponding equivalence relation, we call arrangement of V (determined by
P) the set ar(V) made of all the equations between any two equivalent
variables of V and all the disequations between any two non-equivalent
variables. Formally,
$ar(V) := \{x = y \mid x, y \in V \text{ and } xRy\} \cup \{x \neq y \mid x, y \in V \text{ and not } xRy\}$.

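Proposition 4 [18]. Let $T_1$ and $T_2$ be as above. Consider
$\varphi_1 \in sCNF(T_1)$, $\varphi_2 \in sCNF(T_2)$ and let
$\tilde{x} := var(\varphi_1) \cap var(\varphi_2)$. Then,
$\varphi_1 \wedge \varphi_2$ is satisfiable in $T_1 \cup T_2$ iff there
exists an arrangement $ar(\tilde{x})$ such that
$\varphi_1 \wedge ar(\tilde{x})$ is satisfiable in $T_1$ and
$\varphi_2 \wedge ar(\tilde{x})$ is satisfiable in $T_2$.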
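
Proposition 4 suggests a naive decision method: enumerate the partitions of
the shared variables and ask each solver about the induced arrangement. The
Python sketch below shows that enumeration. The two solver callbacks `sat1`
and `sat2` are hypothetical stand-ins for the component satisfiability
procedures, and, as a simplification, an arrangement is returned as two sets
of unordered pairs of distinct variables rather than the full set of
Definition 3.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):  # add `first` to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part      # or give `first` a block of its own


def arrangement(partition):
    """Equations and disequations induced by a partition."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    pairs = list(combinations(sorted(block_of), 2))
    eqs = {(x, y) for x, y in pairs if block_of[x] == block_of[y]}
    diseqs = {(x, y) for x, y in pairs if block_of[x] != block_of[y]}
    return eqs, diseqs


def satisfiable_in_union(shared, sat1, sat2):
    """sat_i(eqs, diseqs) decides phi_i together with the arrangement."""
    for part in partitions(list(shared)):
        eqs, diseqs = arrangement(part)
        if sat1(eqs, diseqs) and sat2(eqs, diseqs):
            return True
    return False


print(len(list(partitions(["x", "y", "z"]))))
print(arrangement([["x", "y"], ["z"]]))
```

The number of partitions grows quickly with the number of shared variables,
so this exhaustive form is for illustration only.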


4  Extending CLP($\mathcal{X}$)

We want to extend the CLP scheme so that we go from a language of type
CLP($\mathcal{X}$), where
$\mathcal{X} := \{\mathcal{X}_1, \ldots, \mathcal{X}_n\}$ is a set of
signature-disjoint constraint structures, to a language of type
MCLP($\mathcal{X}$), where $\mathcal{X}$ is a combination of the previous
structures in the sense that it allows any computation performed in
CLP($\mathcal{X}$) and, furthermore, poses no signature restriction on term
construction.

⁴ The gist of the separation procedure is to repeatedly replace in $\varphi$
each term $t$ of the “wrong” signature with a new variable $x$ and add the
equation $x = t$ to $\varphi$. More details can be found in [17].

The reason why we are interested in combinations of satisfiability
procedures is that CLP systems already utilize separate satisfaction
procedures, the constraint solvers, to deal with the various constraint
theories they support and so already have a main module to drive the goal
reduction process and control the communication with the solvers.

Intuitively, if we rewrite MCLP($\bar{\mathcal{X}}$) statements in a separate form similar
to that mentioned earlier, we may be able to use the various constraint solvers
in much the same way the Nelson-Oppen combination procedure uses the various
satisfiability procedures. Moreover, the machinery we will need for a MCLP($\bar{\mathcal{X}}$)
system will be essentially the same as we would need for a corresponding CLP($\mathcal{X}$)
system. The only necessary addition, to realize the combination of the solvers, will be a
mechanism for generating equations and disequations between variables shared
by the different solvers and propagating them to the solvers themselves. More
precisely, we will need a procedure that, each time a new constraint is given to
one solver, (a) identifies the variables that the constraint shares with those in
the other solvers, (b) creates a backtrack point in the computation and chooses
a (novel) arrangement of those variables, and (c) passes the arrangement to all
the other solvers.

We  clarify  and  formalize  all  this  in  the  following  subsections. 


4.1  The  Extended  Scheme 

The first issue we are confronted with in extending the CLP scheme is the
impossibility of fixing a single domain of computation. Recall that the CLP scheme
puts primacy on a particular structure which represents the intended constraint
domain. The combination procedure we are considering, however, combines
theories, not structures$^{5}$: it succeeds when the input formula is satisfiable in some
model of the combined theory. For this reason, our extension will use as "constraint
domain" a whole class of structures instead of a single one. In this respect,
our scheme is actually a restriction of the Hohfeld-Smolka constraint logic
programming framework [7]. The restriction is achieved along two dimensions:
the constraint language and the set of solution structures. We use only sCNF
formulas as constraints and elementary classes$^{6}$ as the class of structures over
which constraint satisfiability is tested. In particular, the class associated to a
given MCLP($\bar{\mathcal{X}}$) language is the set of the models of the union of the component
theories.

Formally, the constraint structure $\bar{\mathcal{X}}$ for the MCLP($\bar{\mathcal{X}}$) scheme is defined as
the tuple

where $\Sigma_1, \ldots, \Sigma_n$ are pairwise disjoint signatures and $T_i$ is a stably-infinite
$\Sigma_i$-theory, for each $i \in \{1, \ldots, n\}$. The constraint theory for MCLP($\bar{\mathcal{X}}$) is
$T := T_1 \cup \cdots \cup T_n$, the constraint language is $sCNF(T)$, and the set of solution
structures is $Mod(T)$.

$^{5}$ For some very recent work on the combination of structures, see [3].

$^{6}$ Recall that a class of structures is called elementary if it coincides with the set of
models of some first-order theory.

Next, we describe a logical and a top-down operational model of MCLP($\bar{\mathcal{X}}$)
where we assume that, for each $T_i$, a solver $Sat_i$ is available that decides
satisfiability of sCNF formulas in $T_i$.


4.2  Logical  Semantics 

The format of MCLP($\bar{\mathcal{X}}$) statements is identical to that of CLP statements
except that mixed constraints are allowed with no restrictions. As a consequence,
MCLP($\bar{\mathcal{X}}$) adopts CLP($\mathcal{X}$)'s logical semantics for both its programs
and their completion. The only difference concerns the notation used to describe
MCLP($\bar{\mathcal{X}}$) programs.

Since we want to apply the available solvers modularly, it is convenient to
look at each MCLP($\bar{\mathcal{X}}$) statement as if it had first been converted into an
appropriate separate form. After we define the computation transitions, the careful
reader will observe that it is not necessary to actually write MCLP($\bar{\mathcal{X}}$) programs
in separate form because a separation procedure can be applied on the fly during
subgoal expansion. We use the separate form here simply for notational convenience.
Specifically, instead of a standard CLP rule of the form $p(x) \leftarrow B$, as
defined in [9] for instance, we consider the separate form

$p(x) \leftarrow \tilde{B}$

obtained by applying the procedure mentioned in Section 3 to the body of the
rule.$^{7}$ Analogously, instead of a goal $G$ we consider the corresponding goal $\tilde{G}$.

It should be clear that, under the CLP logical semantics, a MCLP($\bar{\mathcal{X}}$)
statement and its separate form are equivalent.


4.3  Operational  Semantics 

We will only consider the case of two component theories here, as the $n$-component
case is an easy generalization.

$^{7}$ This is always possible as the body of a MCLP($\bar{\mathcal{X}}$) rule is a sCNF formula.

$^{8}$ For simplicity, we have decided to ignore the issue of delayed constraints here. We
would like to point out however that our extension could be easily applied with
comparable results to an operational model including delayed constraints such as
the one described in [9].

As with CLP($\mathcal{X}$), computation in MCLP($\bar{\mathcal{X}}$) can be described as a sequence
of state transitions. Each state in turn is described by either the symbol fail or
a tuple of the form $\langle A, C_1, C_2 \rangle$ where $A$ is a set of pure atoms and constraints,
and for $i = 1, 2$, $C_i$, the constraint store, is a set of $\mathcal{L}_i$-constraints.$^{8}$ We assume
the presence of a function, select, that when applied to the first element $A$ of
a transition returns a member of $A$ non-deterministically. State transitions are
defined as follows.

1. $\langle A, C_1, C_2 \rangle \to_r \langle A \cup B - a(x),\ C_1', C_2' \rangle$

where $a(x) := select(A)$ is an atom, the program $P$ contains the rule $a(y) \leftarrow B$,
and $C_i' := C_i \cup \{x = y\}$ for $i = 1, 2$.

2. $\langle A, C_1, C_2 \rangle \to_r fail$

where $a(x) := select(A)$ is an atom and no rule in $P$ has $a$ as the predicate
symbol of its head.

3. $\langle A, C_1, C_2 \rangle \to_c \langle A - c,\ C_1', C_2' \rangle$

where $c := select(A)$ is a constraint literal and, for $i = 1, 2$, $C_i' := C_i \cup \{c\}$ if $c$ is an
$\mathcal{L}_i$-constraint, $C_i' := C_i$ otherwise.

4. $\langle A, C_1, C_2 \rangle \to_s \langle A, C_1', C_2' \rangle$

where $ar(v)$ is an arrangement of the variables shared by $C_1$ and $C_2$, $C_i' :=
C_i \cup ar(v)$, and $Sat_i(C_i')$ succeeds for $i = 1, 2$.

5. $\langle A, C_1, C_2 \rangle \to_s fail$

where $ar(v)$ is an arrangement of the variables shared by $C_1$ and $C_2$, $C_i' :=
C_i \cup ar(v)$ for $i = 1, 2$, and either of $Sat_1(C_1')$ or $Sat_2(C_2')$ fails.

Similarly to CLP($\mathcal{X}$), transitions of type $\to_r$ are just goal reduction steps.
The difference here is that the variable equalities produced by matching the
selected predicate with the head of some rule go to both constraint solvers as,
by definition, an equality predicate with variable arguments belongs to both $\mathcal{L}_1$
and $\mathcal{L}_2$.

Transitions of type $\to_c$ feed the constraint solvers with a new constraint,
where each constraint goes to the relative solver (variable equalities going to
both solvers).
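The routing performed by a constraint-adding transition can be illustrated with a short Python sketch. It is an assumption-laden toy, not the paper's implementation: literals and terms are nested tuples `(symbol, arg1, ..., argk)` with variables as strings, `sig_of` is a hypothetical map from symbols to store indices (0-based here), and the symbols `=` and `!=` are treated as shared by all signatures.

```python
def symbols(term):
    """Return the set of symbols occurring in a term (variables are str)."""
    if isinstance(term, str):
        return set()
    head, *args = term
    return {head}.union(*(symbols(a) for a in args))


def add_constraint(literal, sig_of, stores):
    """Return new stores after routing the pure `literal`.

    A literal built from symbols of one signature goes to that store only;
    a (dis)equality between variables goes to every store.
    """
    sigs = {sig_of[s] for s in symbols(literal) if s not in ("=", "!=")}
    if len(sigs) > 1:
        raise ValueError("literal is not pure; separate it first")
    targets = sigs or set(range(len(stores)))
    return [store + [literal] if i in targets else store
            for i, store in enumerate(stores)]
```

For instance, with `sig_of = {"plus": 0, "<": 0, "cons": 1}`, the literal `("=", "x", "y")` is added to both stores, while `("<", "x", "y")` reaches only the first one.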

Transitions of type $\to_s$ differ more significantly from the corresponding
transitions in CLP($\mathcal{X}$) as they actually implement, in an incremental fashion, the
combination procedure. They therefore deserve a more detailed explanation.

In terms of the procedure, for every $\to_s$ transition, we consider the constraint
stores $C_1$ and $C_2$ as the $i$-pure halves of sCNF formulas whose satisfiability must
be checked. For each constraint store, we use the constraint solver "as is" but
we make sure that global consistency information is shared by the two solvers by
choosing an arrangement of the variables they have in common. It goes without
saying that, like the transitions of type $\to_r$, transitions of type $\to_s$ are
non-deterministic. With the first type, the choice is among the possible reductions of
the selected predicate; with the second, it is among the possible arrangements of
the shared variables. In actual implementations of the scheme then, backtracking
mechanisms similar to those used for $\to_r$ transitions must be used. For space
limitations we cannot include a more complete discussion on the implementation
of $\to_s$ transitions and its various possible optimizations. The interested reader is
again referred to [17].
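The consistency-checking transitions (types 4 and 5 above) can be sketched as a Python generator that yields one successor pair of stores per arrangement accepted by both solvers; an empty iteration corresponds to the transition to fail, and resuming the generator plays the role of backtracking. This is a minimal sketch under strong assumptions: constraints are triples `("=", x, y)` or `("!=", x, y)` over variable names, and the toy solver `consistent` (a union-find check for the theory of pure equality) merely stands in for the real solvers $Sat_1$ and $Sat_2$. All names are hypothetical.

```python
from itertools import combinations


def partitions(items):
    """Yield all partitions of the list `items` (each a list of blocks)."""
    if not items:
        yield []
        return
    head, tail = items[0], items[1:]
    for p in partitions(tail):
        for i in range(len(p)):
            yield p[:i] + [[head] + p[i]] + p[i + 1:]
        yield [[head]] + p


def arrangement(partition):
    """Literals ('=', x, y) or ('!=', x, y) for each pair of distinct variables."""
    block = {v: i for i, b in enumerate(partition) for v in b}
    return [("=" if block[x] == block[y] else "!=", x, y)
            for x, y in combinations(sorted(block), 2)]


def consistent(literals):
    """Toy solver: decide a set of variable equalities and disequalities."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    for op, x, y in literals:
        if op == "=":
            parent[find(x)] = find(y)
    return all(find(x) != find(y) for op, x, y in literals if op == "!=")


def variables_of(literals):
    """Return the set of variables occurring in a list of literals."""
    return {v for _, x, y in literals for v in (x, y)}


def consistency_step(c1, c2, sat1, sat2):
    """Yield (c1', c2') for each arrangement accepted by both solvers."""
    shared = sorted(variables_of(c1) & variables_of(c2))
    for p in partitions(shared):           # choice point: one arrangement
        ar = arrangement(p)
        if sat1(c1 + ar) and sat2(c2 + ar):
            yield c1 + ar, c2 + ar
```

With stores `[x = z, y = z]` and `[x != y]` the generator yields nothing, since the only arrangement acceptable to the first solver (`x = y`) is rejected by the second.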

The concepts of derivation, final state, fair/failed/successful derivation,
answer constraint, and computation tree can be defined analogously to the
corresponding CLP($\mathcal{X}$) concepts (see [9]).

5  Computational Properties of MCLP($\bar{\mathcal{X}}$)

To discuss the main computational properties of MCLP($\bar{\mathcal{X}}$) it is necessary to
specify a more detailed operational semantics than the one given in the previous
section. Since any implementation of MCLP($\bar{\mathcal{X}}$) is a deterministic system, a
particular computation rule has to be defined. For us, this amounts to specifying
the behavior of the select function and the order in which the various types of
transitions are applied. We will need to further restrict our attention to specific
classes of MCLP($\bar{\mathcal{X}}$) systems to prove some of the properties of MCLP($\bar{\mathcal{X}}$).

Definition 5. Let $\to_{cs}$ be the two-transition sequence $\to_c \to_s$. We say that a
MCLP($\bar{\mathcal{X}}$) system is quick-checking if all of its derivations are sequences of $\to_r$
and $\to_{cs}$ transitions only.

A quick-checking CLP($\mathcal{X}$) system verifies the consistency of the constraint store
immediately after it modifies it. Analogously, a quick-checking MCLP($\bar{\mathcal{X}}$) system
verifies the consistency of the union of all the constraint stores (by means of
equality sharing among the solvers) immediately after it modifies at least one of
them.

Definition 6. A MCLP($\bar{\mathcal{X}}$) system is ideal if it is quick-checking and uses a fair
computation rule.$^{9}$

In  both  schemes,  we  can  define  the  concept  of  finite  failure  if  we  restrict 
ourselves  to  the  class  of  ideal  systems.  We  say  that  a  goal  G  is  finitely  failed  for 
a  program  P  if,  in  any  ideal  system,  every  derivation  of  G  in  P  is  failed. 


5.1  Comparing CLP($\mathcal{X}$) with MCLP($\bar{\mathcal{X}}$)

To show that the main soundness and correctness properties of the CLP scheme
lift to our extension, we will consider, together with the given MCLP($\bar{\mathcal{X}}$) system,
a corresponding CLP($\mathcal{X}$) system that, while accepting the very same programs,
supports the combined constraint theory directly (i.e., with a single solver) and
show that the two systems have the same computational properties.

$^{9}$ This definition differs from that given in [9] because we adopt a slightly different
definition of derivation (see [17] for more details), but it refers to the same class of
systems.

Actually, MCLP($\bar{\mathcal{X}}$) systems cannot have a corresponding CLP($\mathcal{X}$) system
since the original scheme and ours define constraint satisfiability in a different
manner. In CLP($\mathcal{X}$), the satisfiability test on the constraint store is successful if
the store is satisfiable in the fixed structure $\mathcal{D}$. In MCLP($\bar{\mathcal{X}}$) instead, the
satisfiability test is successful if the (union of all) constraint store(s) is satisfiable in
any structure among those modeling the constraint theory. However, correspondence
becomes possible if we "relax" the CLP($\mathcal{X}$) system by testing satisfiability
within the class of structures $Mod(T)$, where $T$ is the chosen constraint theory,
instead of a single structure.

The relaxed CLP scheme is more general than the original scheme but less
general than that proposed by Hohfeld and Smolka in [7]. In fact, its soundness
is derivable as a consequence of the soundness of Hohfeld and Smolka's. Its
completeness, however, cannot be derived from their scheme because it is essentially
a consequence of the choice of a first-order constraint language and theory, which
Hohfeld and Smolka do not require.

In  Section  5.2,  we  will  see  that  not  only  is  the  relaxed  CLP  scheme  sound 
and  complete,  but  also,  and  more  importantly,  it  has  logical  properties  no  weaker 
than  those  exhibited  by  the  CLP  scheme. 

It should not be difficult to see now that, once we have shown that CLP($\mathcal{X}$)
maintains its nice properties even with a satisfiability test over an elementary
class of structures, soundness and completeness results for MCLP($\bar{\mathcal{X}}$) easily
follow. The intuitive justification is that it is immaterial whether we check
for satisfiability in the union constraint theory utilizing a combined procedure
or a non-combined one.


5.2  Soundness  and  Completeness  of  Relaxed  CLP 

We will now consider CLP($\mathcal{X}$) systems that are instances of the relaxed CLP
scheme mentioned earlier. For these systems, the tuple $\mathcal{X}$ is defined as in Section
2 with the difference that $\mathcal{D}$ is replaced by $Mod(T)$, and the satisfiability test
succeeds if and only if the input constraint is satisfiable in some element of
$Mod(T)$.

Assuming a relaxed CLP($\mathcal{X}$) system with constraint language $\mathcal{L}$ and theory
$T$, we have formulated the following results after those given in [9] for CLP($\mathcal{X}$).
For lack of space we forgo their proofs here; they are very similar, however, to
those given in [8] and [11] for the corresponding CLP results and can be found
in [17].

Proposition 7 Soundness. Given a program $P$ and a goal $G$:

1. If $G$ has a successful derivation with answer constraint $c$, then $P, T \models c \to G$.

2. When $T$ is satisfaction complete with respect to $\mathcal{L}$, if $G$ has a finite computation tree
with answer constraints $c_1, \ldots, c_n$, then $P^*, T \models G \leftrightarrow c_1 \vee \cdots \vee c_n$.

Proposition 8 Completeness. Given a program $P$, a simple goal $G$ and a
constraint $c$:

1. If $P, T \models c \to G$ and $c$ is satisfiable in $T$, then there are $n > 0$ derivations of
$G$ with respective answer constraints $c_1, \ldots, c_n$ such that $T \models c \to c_1 \vee \cdots \vee c_n$.

2. When $T$ is satisfaction complete with respect to $\mathcal{L}$, if $P^*, T \models G \leftrightarrow c_1 \vee \cdots \vee c_n$
then $G$ has a computation tree with answer constraints $c_1', \ldots, c_m'$ such that
$T \models c_1 \vee \cdots \vee c_n \leftrightarrow c_1' \vee \cdots \vee c_m'$.

In [8], Jaffar and Lassez show that Negation-as-Failure can be used correctly
in their scheme provided that the constraint theory is satisfaction complete with
respect to the constraint language. We discovered that we can still use negation
as failure properly in the relaxed CLP scheme and, in addition, we do not need
satisfaction completeness of the component theories at all. As before, a sufficient
condition for this result is that we use a first-order constraint language.

Proposition 9 Soundness and Completeness of Negation-as-Failure. In
an ideal system, a goal $G$ is finitely failed for a program $P$ iff $P^*, T \models \neg G$.

5.3  Main  Results 

In the following, we consider a MCLP($\bar{\mathcal{X}}$) system, where $\bar{\mathcal{X}}$ is defined as in
Section 4 but limited, again for simplicity, to the combination of only two
stably-infinite theories. We will assume that, while the system satisfies the general
implementation requirements given earlier, its computation rule is flexible enough
with respect to the order in which the various transitions can be applied. Such an
assumption is not necessary for our results but makes their proofs easier and
more intuitive.

We will also assume that programs and goals are all given in separate form.
For notational ease, we will use $\to_{r/c}$ to denote either a $\to_r$ or a $\to_c$ transition.
We start with some easy to prove lemmas.

Lemma 10. If goal $G$ has a successful derivation in a MCLP($\bar{\mathcal{X}}$) program $P$,
then it has a successful derivation with the same answer constraint and such
that all of its transitions are $\to_{r/c}$ transitions, except the last one which is a $\to_s$
transition.

Essentially, the lemma states that a successful derivation can always be
rearranged into a derivation of the form

$\langle G, \emptyset, \emptyset \rangle \to_{r/c}^{*} \langle \emptyset, C_1, C_2 \rangle \to_s \langle \emptyset, C_1', C_2' \rangle$

by first reducing the goal to the empty set and then testing the consistency of
the collected constraints.

Lemma 10 also entails that a successful derivation in MCLP($\bar{\mathcal{X}}$) is not just
a finite derivation not ending with fail, but one whose answer constraint is
satisfiable in the union theory. In fact, according to the MCLP($\bar{\mathcal{X}}$) operational
model, a necessary condition for the above derivation to be successful is that $C_i'$
be satisfiable in $T_i$ for $i = 1, 2$. From Prop. 4 then, we can infer that
$\exists_{-var(G)}(C_1' \wedge C_2')$, the answer constraint of the derivation, is satisfiable in $T_1 \cup T_2$.

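Lemma 11. Given a MCLP($\bar{\mathcal{X}}$) program $P$ and a goal $G$, if a derivation of the
form $\langle G, \emptyset, \emptyset \rangle \to_{r/c}^{*} \langle \emptyset, C_1, C_2 \rangle$ exists in $P$, then $T, P \models (C_1 \wedge C_2) \to G$.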
Proposition 12 Soundness of MCLP($\bar{\mathcal{X}}$). Given a program $P$ and a goal $G$,

1. If $G$ has a successful derivation with answer constraint $c$, then $P, T \models c \to G$.

2. When $T$ is satisfaction complete with respect to $\mathcal{L}$, if $G$ has a finite computation tree
with answer constraints $c_1, \ldots, c_n$, then $P^*, T \models G \leftrightarrow c_1 \vee \cdots \vee c_n$.

Proof. Let $x := var(G)$.

1. By Lemma 10, we can assume without loss of generality that the derivation
of $G$ is of the form $\langle G, \emptyset, \emptyset \rangle \to_{r/c}^{*} \langle \emptyset, C_1, C_2 \rangle \to_s \langle \emptyset, C_1', C_2' \rangle$ where $c$ is then
$\exists_{-x}(C_1' \wedge C_2')$. Recalling the definition of $\to_s$ transitions, it is immediate that
$\models c \to \exists_{-x}(C_1 \wedge C_2)$. The claim is then a direct consequence of Lemma 11.

2. [sketch] Let us call the MCLP($\bar{\mathcal{X}}$) system $S$ and assume a corresponding
relaxed CLP system $S_{rel}$. With no loss of generality we assume that, for each
$i \in \{1, \ldots, n\}$, the derivation $\delta_i$ in $S$, having answer constraint $c_i$, is of the form
$\langle G, \emptyset, \emptyset \rangle \to_{r/c}^{*} \langle \emptyset, C_{1i}, C_{2i} \rangle \to_s \langle \emptyset, C_{1i}', C_{2i}' \rangle$. Since $S_{rel}$ has the same
computation rule as $S$, for each $\delta_i$ there is a corresponding derivation $h(\delta_i)$ in $S_{rel}$ of
the form $\langle G, \emptyset \rangle \to_{r/c}^{*} \langle \emptyset, D_i \rangle \to_s \langle \emptyset, D_i \rangle$, where $D_i = C_{1i} \cup C_{2i}$.

Let $\sim$ be an equivalence relation over $\{1, \ldots, n\}$ such that $i \sim j$ iff $\delta_i$ and $\delta_j$
coincide up to the last transition. Notice that $h(\delta_i) = h(\delta_j)$ if $i \sim j$. Recalling
the definition of $\to_s$ transitions, it should not be difficult to see that for each
$j \in \{1, \ldots, n\}$, where $[j]$ is the equivalence class of $j$ with respect to $\sim$, the following
chain of logical equivalences holds,

$\bigvee_{i \in [j]} c_i \leftrightarrow \bigvee_{i \in [j]} \exists_{-x}(C_{1i}' \wedge C_{2i}') \leftrightarrow \exists_{-x}(C_{1j} \wedge C_{2j}) \leftrightarrow \exists_{-x} D_j$.  (5)

By an analogue of Lemma 10 for the relaxed CLP scheme, we can show that
the tree made by all the $h(\delta_i)$'s above is indeed the finite computation tree of $G$
in $S_{rel}$. By the soundness of the relaxed CLP scheme, we then have that

$P^*, T \models G \leftrightarrow \bigvee_j \exists_{-x} D_j$.  (6)

The claim then follows immediately, combining (5) and (6) above. □

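The following lemma is also easy to prove.

Lemma 13. Consider a program $P$, a MCLP($\bar{\mathcal{X}}$) system $S$, and a corresponding
relaxed CLP system $S_{rel}$. Then, for any transition $t$ in $S_{rel}$ of the form
$\langle A, C \rangle \to_{r/c} \langle A', C' \rangle$ there is a transition $t'$ in $S$ of the form $\langle A, C_1, C_2 \rangle \to_{r/c}
\langle A', C_1', C_2' \rangle$ such that $T \models C \leftrightarrow C_1 \wedge C_2$ implies $T \models C' \leftrightarrow C_1' \wedge C_2'$.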

Proposition 14 Completeness of MCLP($\bar{\mathcal{X}}$). Given a MCLP($\bar{\mathcal{X}}$) system, a
program $P$, a simple goal $G$ and a constraint $c$,

1. If $P, T \models c \to G$ and $c$ is satisfiable in $T$, then there are $n > 0$ derivations of
$G$ with respective answer constraints $c_1, \ldots, c_n$ such that $T \models c \to c_1 \vee \cdots \vee c_n$.

2. When $T$ is satisfaction complete with respect to $\mathcal{L}$, if $P^*, T \models G \leftrightarrow c_1 \vee \cdots \vee c_n$,
then $G$ has a computation tree with answer constraints $c_1', \ldots, c_m'$ such that
$T \models c_1 \vee \cdots \vee c_n \leftrightarrow c_1' \vee \cdots \vee c_m'$.

Proof. 1. Let us call the MCLP($\bar{\mathcal{X}}$) system $S$ and assume a corresponding relaxed
CLP system $S_{rel}$. Let $x := var(G)$. To simplify the notation, if $\delta$ is a successful
derivation, we will denote its answer constraint by $ans(\delta)$.

Now, by the completeness of the relaxed CLP scheme, there exists a set $D$
of successful derivations of $G$ in $S_{rel}$ such that $T \models c \to \bigvee_{\delta \in D} ans(\delta)$. We show
that, for each $\delta \in D$, there is a set $D_\delta$ of successful derivations of $G$ in $S$ such that
$T \models ans(\delta) \leftrightarrow \bigvee_{\gamma \in D_\delta} ans(\gamma)$. Then, the claim follows immediately by taking
$c_1 \vee \cdots \vee c_n$ as $\bigvee_{\delta \in D} \left( \bigvee_{\gamma \in D_\delta} ans(\gamma) \right)$.

Consider any $\delta \in D$. We generate a derivation $\delta'$ in $S$ with initial state
$\langle G, \emptyset, \emptyset \rangle$ such that $\delta'$ has a $\to_{r/c}$ transition for each $\to_{r/c}$ transition of $\delta$ in the
way given in Lemma 13, and an empty transition for each $\to_s$ transition of $\delta$.
Using Lemma 13 and the fact that $\to_s$ transitions in $S_{rel}$ preserve equivalence of
the constraint stores, it is easy to show that if $\langle \emptyset, C \rangle$ is the final state of $\delta$, then
the last state of $\delta'$ has the form $\langle \emptyset, C_1, C_2 \rangle$ with $T \models C \leftrightarrow C_1 \wedge C_2$.

We obtain the set $D_\delta$ mentioned above by completing $\delta'$ with one $\to_s$
transition from $\langle \emptyset, C_1, C_2 \rangle$ for each possible arrangement of the shared variables
$v$ between $C_1$ and $C_2$ that is consistent with both stores. Observe that, since
$C_1 \cup C_2$ is satisfiable, being equivalent to the final constraint store of a
successful derivation, we are guaranteed by Prop. 4 that at least one arrangement
of $v$ is consistent with both $C_1$ and $C_2$ and, consequently, that $D_\delta$ is non-empty.
It follows that, for every $\gamma \in D_\delta$, $T \models ans(\gamma) \leftrightarrow \exists_{-x}(C_1 \wedge C_2 \wedge ar(v))$, for some
arrangement $ar(v)$.

Observing that the disjunction of all the arrangements of $v$ is a valid formula,
it is then easy to deduce the following chain of logical equivalences in $T$

$\exists_{-x} C \leftrightarrow \exists_{-x}(C_1 \wedge C_2) \leftrightarrow \exists_{-x}\left(C_1 \wedge C_2 \wedge \bigvee_{ar(v)} ar(v)\right)$

$\leftrightarrow \bigvee_{ar(v)} \exists_{-x}(C_1 \wedge C_2 \wedge ar(v)) \leftrightarrow \bigvee_{\gamma \in D_\delta} ans(\gamma)$

which concludes our proof.

2. The result follows as a consequence of the corresponding result for relaxed
CLP and the construction in the proof of case 1 above. □

Observe that the necessity to consider multiple derivations to show the
completeness of the system is not introduced by our extension: it is already present
in the CLP scheme itself. Our extension, however, may increase the number of
necessary derivations because $\to_s$ transitions can generate multiple successful
derivations, instead of just one, whenever more than one arrangement of variables
is consistent with both the constraint stores.
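To give a feel for how large this branching can become, note that the arrangements of a set of shared variables correspond one-to-one to its partitions, so their number is the Bell number of the set's size. The following small Python sketch (an illustration added here, with the hypothetical function name `bell`) computes it with the Bell triangle.

```python
def bell(n):
    """Number of partitions (hence arrangements) of an n-element set."""
    row = [1]
    for _ in range(n):
        # Each new row of the Bell triangle starts with the last entry
        # of the previous row; every next entry adds the entry above-left.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return row[0]


# Upper bound on the successors of one consistency check
# for 0 to 5 shared variables.
counts = [bell(n) for n in range(6)]
```

The count is only an upper bound on the number of successful derivations, since arrangements rejected by either solver lead to failure.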

By essentially the same arguments given for the relaxed CLP scheme, we can
also prove soundness and completeness of Negation-as-Failure in MCLP($\bar{\mathcal{X}}$).

Proposition 15 Negation-as-Failure. In an ideal MCLP($\bar{\mathcal{X}}$) system, a goal
$G$ is finitely failed for a program $P$ iff $P^*, T \models \neg G$.




6  Conclusions 

In this paper, we have described a way of extending the CLP($\mathcal{X}$) scheme to admit
constraint theories generated as the union of several stably-infinite theories with
pairwise disjoint signatures. The main idea of the extension is to incorporate
in the scheme a well-known method of obtaining a satisfiability procedure for a
union theory as the combination, by means of variable equality sharing, of the
satisfiability procedures of each component theory.

By adopting a non-deterministic equality sharing mechanism, we have been
able to prove that the main properties of our extension directly compare to
those of the original scheme, provided that the CLP($\mathcal{X}$) consistency test on the
constraint store is relaxed from satisfiability in a single structure to satisfiability
in an elementary class thereof.

Specifically,  we  have  first  claimed  that  the  relaxation  of  the  satisfiability  test 
(which  gives  rise  to  what  we  called  a  relaxed  CLP  scheme)  does  not  modify  the 
original  soundness  and  completeness  properties,  even  in  the  case  of  the  negation- 
as-failure  inference  rule.  Then,  we  have  shown  how  the  results  given  for  the 
relaxed  CLP  scheme  lift  to  our  extension. 

Finally, we would like to point out the advantages of adopting a non-deterministic
version of the original equality-sharing mechanism by Nelson and Oppen.
On the theoretical side, our version fits rather nicely into the CLP scheme as it
simply adds another level of "don't-know" non-determinism (corresponding to
the choice of a variable arrangement) into the computational paradigm. On the
practical side, where incremental solvers are already available for each constraint
theory, not only does this scheme preserve their incrementality, a key computational
feature for the implementation of any CLP system, but it also allows one to
use them as they are, with no modification whatsoever to their code or interface.
Abstract. This paper presents the formalization of the symbolic simulation
and analysis technique for hybrid systems developed in [9, 7, 8]. The main
advantage of this technique is the close relation between the hybrid-systems
model Hybrid Automata [1, 8] and the execution model CLP($\mathcal{R}$) [3]. Our
rule-based description is naturally suited for hybrid systems, allowing us (a) to
lift CLP($\mathcal{R}$) definitions and results to the theory of hybrid systems, and
therefore (b) to apply, in addition to forward/backward fixpoint computation
and symbolic model-checking, CLP($\mathcal{R}$) intelligent search and
backtracking procedures [2] in their analysis, since the depth-first search
strategy of CLP($\mathcal{R}$) is incomplete on infinite trees. These techniques were
implemented in part on top of the CLP($\mathcal{R}$) prototype system [4]. We
illustrate our method with a variant of the reactor temperature control system
from [1]. More realistic examples can be found in [9, 7, 8, 6].


1  Hybrid  Systems 

Hybrid systems [1, 8] describe the behavior of physical components that interact
directly with an environment. Such systems usually consist of both a discrete
component and an analog component. They model continuous state changes by means
of differential equations, or inequations, over time intervals (analog behavior) as
well as discrete state changes by means of nondeterministic guarded assignments
that are instantaneous (discrete behavior). Typical examples of hybrid systems are
real-time process-control systems such as automated factories or automated
transportation systems. The correctness of such systems is more subtle and harder to
verify than that of traditional systems because of their real-time aspect.

In this paper we show how a hybrid system $H$ modeled by a hybrid automaton
[1] (Sect. 1.1) can be seen as a CLP($\mathcal{R}$) program [4] (Sect. 1.2). We also describe
some CLP($\mathcal{R}$) systems for evaluating some classes of hybrid systems (Sect. 2). These
allow one to verify reachability, safety, liveness, time-bounded and duration properties
of $H$, written in the Integrator Computation Tree Logic (ICTL) [1], by applying
top-down/bottom-up evaluation methods, symbolic model-checking and intelligent
search and backtracking strategies (Sect. 3).
1.1  Hybrid Automata Representation

Informally, a hybrid automaton consists of a finite-state automaton, real-valued
variables, discrete relations, activities, invariants and an initial condition imposed
on the initial location. The finite-state automaton describes the system structure;
the synchronization symbols together with the discrete relations describe the
interaction of the system with its environment. The activities are conjunctions of
differential (in)equations over the real-valued variables and describe the time-dependent
changes of the system. The invariants are linear formulae: the system may remain
in a location only while the invariant of that location is true. The initial condition
describes the set of possible values for the system when it starts in the initial location.

Example 1 Reactor Temperature Control System. The system controls the
temperature of a reactor core. It consists of three hybrid automata: a reactor core
controller CONTROLLER and two control rods ROD$_i$ ($i = 1, 2$) (cf. Fig. 1). The
real-valued variable $\theta$ describes the temperature. The goal is to keep $\theta$ between a
minimal temperature $\theta_m$ and a maximal temperature $\theta_M$. If $\theta$ reaches $\theta_M$, then $\theta$ is
to decrease by introducing one of the control rods into the reactor core. At the
beginning $\theta$ is $\theta_m$ degrees and both control rods are outside of the reactor core. In this
case  0  raises  according  to  the  differential  equation  0  =  ^  -f  50,  location  'nojrod’.  0 
decre^es  according  to  the  differential  equations  ^  ^  -  56  (location  Vodi’)  and 

®  ~  (location  'rodT)  depending  on  the  control  rod  used.  A  control  rod 

may be used again if $T > 0$ time units have elapsed since it was last removed. If $\theta$
cannot decrease because no control rod is available, then a shutdown of the reactor
is necessary. A shutdown of the system should be prevented. The value of the clock
$x_i$ represents the time elapsed since the last use of the $i$-th control rod.


Fig. 1. ROD$_1$, ROD$_2$ and CONTROLLER for the temperature control system
Finite-state Automaton. The locations in ROD$_i$ have the meaning: 'out$_i$' ('in$_i$'): the $i$-th control rod is outside (inside) of the reactor core. The transitions have the meaning: $(out_i, add_i, in_i)$ ($(in_i, remove_i, out_i)$): the $i$-th control rod is introduced into (removed from) the reactor core. In CONTROLLER the locations have the meaning: 'rod$_i$': the $i$-th control rod is inside of the reactor core; 'no_rod': there is no control rod inside of the reactor core; 'shutdown': a shutdown of the reactor occurs. The transitions have the meaning: $(no\_rod, add_i, rod_i)$ ($(rod_i, remove_i, no\_rod)$): the $i$-th control rod is introduced into (removed from) the reactor core; $(no\_rod, \varepsilon, shutdown)$: shutdown of the reactor ($\varepsilon$ is the empty synchronization symbol). 'out$_i$' and 'no_rod' are the initial locations for ROD$_i$ and CONTROLLER, respectively.

Data Variables. The real-valued variables $x = x_1, \ldots, x_n$ are called data variables. The natural number $n$ is the dimension of the hybrid automaton. $x_i$ and $\theta$ are data variables of ROD$_i$ and CONTROLLER, respectively. In ROD$_i$, $T$ is a parameter. In CONTROLLER, $\theta_m$ and $\theta_M$ are parameters as well.

Discrete Behavior. In addition, a hybrid automaton consists of nondeterministic guarded assignments (ndga's) of the form $\psi(x),\ 'a' \mapsto x' := f(x)$, where $\psi(x)$ is a convex linear formula over $x$, $a$ is a synchronization symbol and $f(x) = (f_1(x), \ldots, f_n(x))$ is a sequence of linear terms $f_i(x)$ over $x$. A ndga in a transition $(l, a, l')$ is instantaneous and may be taken when the convex linear formula $\psi(x)$ is true and the synchronization symbol $a$ occurs. Then the assignments $x' := f(x)$ to the prime data variables $x'$ are made. A ndga $\psi(x),\ 'a' \mapsto x' := f(x)$ for a transition $(l, a, l')$ generates a binary discrete relation $\alpha_{(l,a,l')}(x, x') \subseteq \mathbb{R}^n \times \mathbb{R}^n$ between the values before (nonprime data variables $x$) and after (prime data variables $x'$) the transition. In the graphical representation of hybrid automata empty synchronization symbols $\varepsilon$, true formulae and identity assignments are left out. In ROD$_i$, $x_i \geq T,\ 'add_i' \mapsto x_i' := x_i$ and $true,\ 'remove_i' \mapsto x_i' := 0$ are the ndga's for the transitions $(out_i, add_i, in_i)$ and $(in_i, remove_i, out_i)$, respectively. In CONTROLLER, $\theta = \theta_M,\ 'add_i' \mapsto \theta' := \theta$, $\theta = \theta_m,\ 'remove_i' \mapsto \theta' := \theta$ and $\theta = \theta_M,\ '\varepsilon' \mapsto \theta' := \theta$ are the ndga's for the transitions $(no\_rod, add_i, rod_i)$, $(rod_i, remove_i, no\_rod)$ and $(no\_rod, \varepsilon, shutdown)$, respectively. For the parameters $T$, $\theta_m$ and $\theta_M$ the following identity assignments can be considered: $T' := T$, $\theta_m' := \theta_m$ and $\theta_M' := \theta_M$. The generated discrete relations have the form:

\[
\begin{aligned}
\alpha_{(out_i, add_i, in_i)}(x_i, x_i') &= (x_i \geq T \wedge x_i' = x_i), \qquad \alpha_{(in_i, remove_i, out_i)}(x_i, x_i') = (x_i' = 0), \\
\alpha_{(no\_rod, add_i, rod_i)}(\theta, \theta') &= \alpha_{(no\_rod, \varepsilon, shutdown)}(\theta, \theta') = (\theta = \theta_M \wedge \theta' = \theta), \\
\alpha_{(rod_i, remove_i, no\_rod)}(\theta, \theta') &= (\theta = \theta_m \wedge \theta' = \theta).
\end{aligned}
\tag{1}
\]

Analog Behavior. For each location $l$ there exists an invariant Inv($l$) and an activity Act($l$). An invariant is a convex linear formula over $x$; an activity is a conjunction of differential (in)equations of the form $\dot{x} \,\#\, f(x)$, where $f(x)$ is a linear term, $x$ is a data variable and $\# \in \{<, \leq, =, \geq, >\}$. A location describes the time-dependent actions of the system. Here the system may continuously change the values of $x$ according to the differential (in)equations $\dot{x} \,\#\, f(x)$ as long as the invariant is true for these values. A hybrid automaton is linear if for each differential (in)equation $\dot{x} \,\#\, f(x)$, $f(x)$ is a constant term. Otherwise, the hybrid automaton is non-linear. In the graphical representation of hybrid automata true formulae (invariants) and zero differential equations $\dot{x} = 0$ are omitted. For instance, in ROD$_i$ all locations have the invariant $true$ and the activities Act($out_i$) = ($\dot{x}_i = 1$) and Act($in_i$) = ($\dot{x}_i = 0$). In CONTROLLER the locations have the invariants Inv($rod_i$) = ($\theta_m \leq \theta$), Inv($no\_rod$) = ($\theta \leq \theta_M$) and Inv($shutdown$) = $true$, while they have the following activities: Act($no\_rod$) = ($\dot{\theta} = \frac{\theta}{10} - 50$), Act($rod_1$) = ($\dot{\theta} = \frac{\theta}{10} - 56$), Act($rod_2$) = ($\dot{\theta} = \frac{\theta}{10} - 60$) and Act($shutdown$) = ($\dot{\theta} = 0$). A parameter $p$ can be considered as a data variable with differential equation $\dot{p} = 0$. CONTROLLER is a non-linear hybrid automaton, while both ROD$_i$ are linear.

Initial Condition. An initial condition $\phi(x)$ is a convex linear constraint (a convex linear formula) over $x$ imposed on the initial location. It describes the set of all possible initial values for $x$ when the system starts at the initial location. In ROD$_i$ and CONTROLLER, $x_i = T$ and $\theta = \theta_m$ are the corresponding initial conditions.

1.2 CLP($\mathcal{R}$) Representation

Besides the hybrid automata representation, hybrid systems can also be modeled as CLP($\mathcal{R}$) programs. Locations are interpreted as predicates over the data variables $x$ and $time$, where $time \in \mathbb{R}^+$ is the non-negative time variable recording the time elapsed since the system started. Discrete relations, invariants, activities and initial conditions are CLP($\mathcal{R}$) constraints. A hybrid system $H$ consists of an initial fact and a set of discrete and analog rules. We now write the CLP($\mathcal{R}$) program corresponding to Example 1. For the initial conditions $x_i = T$ and $\theta = \theta_m$ of the initial locations 'out$_i$' and 'no_rod' we set the initial facts as follows:

    % Initial fact for ROD_i
    out_i(x_i, time) <- x_i = T, time = 0.

    % Initial fact for CONTROLLER
    no_rod(θ, time) <- θ = θ_m, time = 0.

In the body of each initial fact we write the initial condition as a conjunction of convex linear constraints. Each system starts at time 0. For each transition with corresponding discrete relation we set a discrete rule (cf. Equation (1)):

    % Discrete rules for ROD_i
    out_i(x_i, time) <- x_i >= T, 'add_i', x_i' = x_i, in_i(x_i', time).
    in_i(x_i, time) <- 'remove_i', x_i' = 0, out_i(x_i', time).

    % Discrete rules for CONTROLLER
    no_rod(θ, time) <- θ = θ_M, 'add_i', θ' = θ, rod_i(θ', time).
    rod_i(θ, time) <- θ = θ_m, 'remove_i', θ' = θ, no_rod(θ', time).
    no_rod(θ, time) <- θ = θ_M, 'ε', θ' = θ, shutdown(θ', time).

Time does not progress in discrete rules. Empty synchronization symbols and identity (copy) functions can be omitted. Moreover, if a prime variable $x'$ is assigned a (linear) term $t$, then $t$ can be substituted for $x'$ in the target location $l'$ and the assignment $x' = t$ can be discarded. Nonprime (prime) data variables $x$ ($x'$) in the source (target) location stand for any values in that location, that is, for any values of $x$ ($x'$) before (after) the discrete rule has been applied. If we want to express a parameterized hybrid system, all parameters have to occur in each atom in the same position, such that they can always be copied by each rule.
Now, consider the differential (in)equation $\dot{x}_i \,\#\, f(x_i)$ for the data variable $x_i$ in the location $l$. We assume $f(x_i)$ is integrable on the closed interval $[0, \delta_l]$¹, where $\delta_l = time' - time \geq 0$ is the delay of the location $l$. Then

\[ x_i'(t) - x_i'(0) \;\#\; \int_0^t f(x_i)\,dt \quad \text{for } t \in [0, \delta_l] \tag{2} \]

computes the value of the prime variable $x_i'$ when $t \in [0, \delta_l]$ time units have passed since the system entered the location $l$. It is $x_i'(0) = x_i$, and for non-linear hybrid systems with $f(x_i) = c_1 * x_i + c_2$, $c_1, c_2 \in \mathbb{R}$ with $c_1 \neq 0$, (2) thereby yields:

\[ x_i'(t) \;\#\; -\frac{c_2}{c_1} + \left(x_i + \frac{c_2}{c_1}\right) * e^{c_1 t} \quad \text{for } t \in [0, \delta_l]. \tag{3} \]

The linear case is trivial. A location $l$ with invariant Inv($l$), activity Act($l$) = $\bigwedge_{i \in I} (\dot{x}_i \,\#\, f(x_i))$ and delay $\delta_l$ generates an analog relation $\beta_l(x, x') \subseteq (\mathbb{R}^n \times \mathbb{R}^+)^2$:

\[ \beta_l(x, x') = \mathrm{Inv}(l)(x') \wedge \delta_l \geq 0 \wedge \bigwedge_{i \in I} \left( x_i'(t) \;\#\; x_i + \int_0^t f(x_i)\,dt \right) \quad \text{for } t \in [0, \delta_l]. \tag{4} \]
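The closed form (3) can be checked numerically for the equality case. The following minimal sketch evaluates the solution of $\dot{x} = c_1 x + c_2$ and compares its slope, estimated by a central finite difference, with the right-hand side. The names `ode_rhs`, `flow` and `slope_error` are illustrative assumptions and do not appear in the original system.

```python
import math


def ode_rhs(x, c1, c2):
    """Right-hand side f(x) = c1*x + c2 of the activity (equality case)."""
    return c1 * x + c2


def flow(x0, c1, c2, t):
    """Closed-form solution of dx/dt = c1*x + c2 with x(0) = x0, for c1 != 0."""
    return -c2 / c1 + (x0 + c2 / c1) * math.exp(c1 * t)


def slope_error(x0, c1, c2, t, h=1e-6):
    """Gap between a central finite-difference slope of flow and f(flow(t))."""
    numeric = (flow(x0, c1, c2, t + h) - flow(x0, c1, c2, t - h)) / (2 * h)
    return abs(numeric - ode_rhs(flow(x0, c1, c2, t), c1, c2))
```

With $c_1 = 1/10$ and $c_2 = -56$ the function `flow` corresponds to the activity of location 'rod$_1$'; a small `slope_error` indicates that the closed form indeed satisfies the differential equation.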


By (3) and (4) we set for ROD$_i$ and CONTROLLER the following analog rules:

    % Analog rules for ROD_i
    out_i(x_i, time) <- time <= time', x_i' = x_i + (time' - time), out_i(x_i', time').
    in_i(x_i, time) <- time <= time', x_i' = x_i, in_i(x_i', time').

    % Analog rules for CONTROLLER
    rod_1(θ, time) <- θ_m <= θ', time <= time',
        θ' = (θ - 560) * exp((time' - time)/10) + 560, rod_1(θ', time').
    no_rod(θ, time) <- θ' <= θ_M, time <= time',
        θ' = (θ - 500) * exp((time' - time)/10) + 500, no_rod(θ', time').
    rod_2(θ, time) <- θ_m <= θ', time <= time',
        θ' = (θ - 600) * exp((time' - time)/10) + 600, rod_2(θ', time').
    shutdown(θ, time) <- time <= time', shutdown(θ, time').
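To make the exponential analog rules of CONTROLLER concrete, the sketch below evaluates one analog step and inverts it to obtain the delay needed to move the temperature between two values. The numeric bounds used in the usage note ($\theta_m = 510$, $\theta_M = 550$) are assumptions chosen for illustration only; the text keeps them as parameters. The function names are likewise hypothetical.

```python
import math


def analog_step(theta, k, delay, tau=10.0):
    """theta' = (theta - k) * exp(delay / tau) + k, as in the analog rules.

    k is 500, 560 or 600 for no_rod, rod_1 and rod_2, respectively.
    """
    return (theta - k) * math.exp(delay / tau) + k


def dwell_time(theta_start, theta_target, k, tau=10.0):
    """Delay after which analog_step(theta_start, k, delay) equals theta_target."""
    ratio = (theta_target - k) / (theta_start - k)
    if ratio < 1.0:
        # The target lies in the past of the flow (or on the other side of k).
        raise ValueError("target is not reached by letting time progress")
    return tau * math.log(ratio)
```

Under the assumed bounds, `dwell_time(510, 550, 500)` is the time spent in 'no_rod' before a rod must be added, and `dwell_time(550, 510, 560)` and `dwell_time(550, 510, 600)` are the cooling times with rod 1 and rod 2.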

In analog rules time progresses since $time \leq time'$; $time' - time \geq 0$ is the delay. To summarize, we state the following proposition (for the sake of simplicity we set $\alpha_{(l,a,l')}(x, x') = \alpha_{(l,a,l')}(x, x') \wedge time' = time$ and consequently $l'(x') = l'(x', time)$).

Proposition 1. A hybrid system $H$ consists of one fact and a set of rules:

\[ \{ l_0(x) \leftarrow \phi(x). \} \;\cup\; \bigcup_{(l,a,l')} \{ l(x) \leftarrow {'a'}, \alpha_{(l,a,l')}(x, x'), l'(x'). \} \;\cup\; \bigcup_{l} \{ l(x) \leftarrow \beta_l(x, x'), l(x'). \}. \]

¹ This also applies to piecewise-continuous functions.
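Product of Hybrid Systems. Let $H_i$, for $i = 1, 2$, be two hybrid systems over the common set of data variables $x$. Both hybrid systems synchronize on the common set of synchronization symbols $T_1 \cap T_2$. The product of $H_1$ and $H_2$ is the hybrid system $H = H_1 \times H_2$ consisting of the discrete rules:

$(l_1, l_2)(x) \leftarrow {'a'}, \alpha_{(l_1,a,l_1')}(x, x'), \alpha_{(l_2,a,l_2')}(x, x'), (l_1', l_2')(x'). \in H$ iff
$l_i(x) \leftarrow {'a'}, \alpha_{(l_i,a,l_i')}(x, x'), l_i'(x'). \in H_i$ for $i = 1, 2$,

$(l_1, l_2)(x) \leftarrow {'a_1'}, \alpha_{(l_1,a_1,l_1')}(x, x'), (l_1', l_2)(x'). \in H$ and
$(l_1, l_2)(x) \leftarrow {'a_2'}, \alpha_{(l_2,a_2,l_2')}(x, x'), (l_1, l_2')(x'). \in H$ iff
$l_i(x) \leftarrow {'a_i'}, \alpha_{(l_i,a_i,l_i')}(x, x'), l_i'(x'). \in H_i$
for $i = 1, 2$, $a_1 \in T_1 \setminus T_2$ and $a_2 \in T_2 \setminus T_1$,

and of the analog rules:

$(l_1, l_2)(x) \leftarrow \beta_{l_1}(x, x'), \beta_{l_2}(x, x'), (l_1, l_2)(x'). \in H$ iff
$l_i(x) \leftarrow \beta_{l_i}(x, x'), l_i(x'). \in H_i$ for $i = 1, 2$.

For the initial fact we have:

$(l_{01}, l_{02})(x) \leftarrow \phi_1(x), \phi_2(x).$ is the initial rule of $H$ iff
$l_{0i}(x) \leftarrow \phi_i(x). \in H_i$ for $i = 1, 2$.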
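The discrete part of the product construction can be illustrated on plain labelled transition graphs, ignoring the constraints attached to the rules. The sketch below is only an approximation of the definition above under stated assumptions: transitions are triples (source, symbol, target), the empty symbol is modelled as the private label "eps" of CONTROLLER, and only rod 1 is included. All names are hypothetical.

```python
def locations(transitions):
    """All locations that occur as a source or a target."""
    return ({src for (src, _, _) in transitions}
            | {dst for (_, _, dst) in transitions})


def product(trans1, labels1, trans2, labels2):
    """Discrete transitions of H1 x H2 (constraints are ignored)."""
    shared = labels1 & labels2
    locs1, locs2 = locations(trans1), locations(trans2)
    result = set()
    for (l1, a, m1) in trans1:
        if a in shared:
            # Both components move together on a common symbol.
            result |= {((l1, l2), a, (m1, m2))
                       for (l2, b, m2) in trans2 if b == a}
        else:
            # Only the first component moves; the second keeps its location.
            result |= {((l1, l2), a, (m1, l2)) for l2 in locs2}
    for (l2, b, m2) in trans2:
        if b not in shared:
            # Only the second component moves.
            result |= {((l1, l2), b, (l1, m2)) for l1 in locs1}
    return result


CONTROLLER = {("no_rod", "add1", "rod1"),
              ("rod1", "remove1", "no_rod"),
              ("no_rod", "eps", "shutdown")}
ROD1 = {("out1", "add1", "in1"), ("in1", "remove1", "out1")}

PRODUCT = product(CONTROLLER, {"add1", "remove1", "eps"},
                  ROD1, {"add1", "remove1"})
```

In this reduced example the symbols add1 and remove1 are synchronized, so the controller can only enter 'rod1' while the rod moves from 'out1' to 'in1', whereas the shutdown transition is interleaved with every rod location.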


A data state for the data variables $x$ ($x'$ resp.) is a function $\varpi$ that assigns a real value $\varpi(x_i) \in \mathbb{R}$ to each data variable $x_i$. The behavior of a hybrid system in each time instant is described through data states $\varpi$ and locations $l$. The pair $\sigma = \langle l, \varpi \rangle$ is a state. Intuitively, a discrete rule $d(l, a, l')$ is a brief description of the discrete successor states $\langle l', \varpi' \rangle$ reachable from the state $\langle l, \varpi \rangle$ at time point $time$. An analog rule $a(l)$ is a concise description of all possible analog successor states $\langle l, \varpi' \rangle$ reachable from the state $\langle l, \varpi \rangle$ after $time' - time \geq 0$ time units have passed since the system last entered the location $l$ at the time point $time$. A possible behavior of a hybrid system $H$ is described by a trajectory.

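Trajectory. A trajectory $\rho$ is a finite or infinite sequence

\[ \rho : \langle l_0, \varpi_0 \rangle \mapsto_{time_0} \langle l_1, \varpi_1 \rangle \mapsto_{time_1} \langle l_2, \varpi_2 \rangle \mapsto_{time_2} \cdots \tag{5} \]

of states $\langle l_i, \varpi_i \rangle$ and time points $time_i$, such that for all $i \geq 0$, for all $time' \in \mathbb{R}^+$ with $time_i \leq time' \leq time_{i+1}$, and data state $\varpi'$, $\beta_{l_i}(\varpi_i, time_i, \varpi', time')$ is true, $\langle l_i, \varpi' \rangle$ is an analog successor state of $\langle l_i, \varpi_i \rangle$, and $\langle l_{i+1}, \varpi_{i+1} \rangle$ is a discrete successor state of $\langle l_i, \varpi' \rangle$.

The position $\pi$ of a trajectory $\rho$ is a pair $\pi = (i, r)$, where $i \in \mathbb{N}$ and $r \in \mathbb{R}^+$ with $r \leq \delta_i$; $i$ is the $i$-th place in $\rho$ and $r$ the time of an analog successor state $\langle l_i, \varpi_i' \rangle$ of $\langle l_i, \varpi_i \rangle$ before the $(i+1)$-th state $\langle l_{i+1}, \varpi_{i+1} \rangle$ in $\rho$. The relation '<' between two positions $\pi$ and $\pi'$ is defined as follows: $(i, r) = \pi < \pi' = (i', r')$ iff $i < i'$, or $i = i'$ and $r < r'$. The state at position $(i, r)$ is $sap_\rho(i, r) = \langle l_i, \varpi_i' \rangle$. A region is a pair $\langle l, \Psi \rangle$ with location $l$ and formula $\Psi$ over $x$. It is $\langle l, \Psi \rangle = \{ \langle l, \varpi \rangle \mid \Psi[x/\varpi] \}$.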
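Symbolic Trajectory. A symbolic trajectory $\rho$ is a finite or infinite sequence

\[ \rho : \langle l_0, \Psi_0 \rangle \mapsto \langle l_1, \Psi_1 \rangle \mapsto \langle l_2, \Psi_2 \rangle \mapsto \cdots \tag{6} \]

of regions $\langle l_i, \Psi_i \rangle$ such that for all $i \geq 0$, $\langle l_{i+1}, \varpi_{i+1} \rangle \in \langle l_{i+1}, \Psi_{i+1} \rangle$ iff there exist $\langle l_i, \varpi_i \rangle, \langle l_i, \varpi_i' \rangle \in \langle l_i, \Psi_i \rangle$ such that $\langle l_i, \varpi_i' \rangle$ is an analog successor state of $\langle l_i, \varpi_i \rangle$ and $\langle l_{i+1}, \varpi_{i+1} \rangle$ is a discrete successor state of $\langle l_i, \varpi_i' \rangle$.

The number of applied discrete rules (transition steps) in a (symbolic) trajectory $\rho$ is called the length of $\rho$. The duration $\delta_\rho$ of a trajectory $\rho$ is the sum $\sum_{i \geq 0} \delta_i$. A trajectory is divergent if $\delta_\rho = \infty$. Clearly, a symbolic trajectory (6) represents a set of trajectories of the form (5) where $\langle l_i, \varpi_i \rangle \in \langle l_i, \Psi_i \rangle$ for all $i \geq 0$. Besides, every trajectory (5) is represented by some symbolic trajectory of the form (6).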

Nonzeno Hybrid System. A state $\langle l, \varpi \rangle$ is admissible if Inv($l$)$[x/\varpi]$ is true. The hybrid system $H$ is nonzeno if for each admissible state $\langle l, \varpi \rangle$ there exists a divergent trajectory $\rho$ of $H$ which begins at $\langle l, \varpi \rangle$, i.e., $sap_\rho(0, 0) = \langle l, \varpi \rangle$. Intuitively, $H$ is nonzeno iff every finite prefix of a trajectory is a prefix of a divergent trajectory. In the following we only consider nonzeno hybrid systems.


1.3  Integrator  Computation  Tree  Logic,  ICTL 


We specify safety, liveness, time-bounded and duration requirements of hybrid systems in ICTL [1]. Let $H$ be a hybrid system with data variables $x$ and set of locations $L$, and let $z = z_1, \ldots, z_m$ be a vector of non-negative real-valued variables $z_i \in \mathbb{R}^+$ called integrators. An integrator is a clock which advances only in a subset $I$ of $L$. $I$ is called the type of $z$. A $z$-extended data predicate of $H$ is a formula over $x \uplus z$. A $z$-extended state predicate of $H$ is a collection of $z$-extended data predicates, one for each location in $L$. The formulae of ICTL are defined by the following grammar:

\[ \varphi ::= \psi \mid \neg\varphi \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \,\exists\mathcal{U}\, \varphi_2 \mid \varphi_1 \,\forall\mathcal{U}\, \varphi_2 \mid (z : I).\varphi , \tag{7} \]

where $\psi$ is a $z$-extended state predicate that contains only integer constants, $z$ is an integrator from $z$, $I \subseteq L$ is the type of $z$, and $(z : I).\varphi$ is an integrator reset quantifier. The ICTL formula $\varphi$ is closed if every occurrence of an integrator in $\varphi$ is bound by an integrator reset quantifier. It is assumed that different integrator reset quantifiers in $\varphi$ bind different integrators. We construct the hybrid system $H^z$ by extending $H$ with integrators $z$ such that for the interpretation of the formula $\varphi$ in $H^z$ there is in $H$ a corresponding interpretation. For each integrator $z_i$ and each location $l \in L$ the analog relation of $l$ has the form (we set $y = x, z$ and $\bar{y} = x, z, time$):


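\[ \beta_l(y, y') = \beta_l(x, x') \wedge \bigwedge_{i=1}^{m} z_i' = \begin{cases} z_i + \delta_l & \text{if } l \in I_i, \\ z_i & \text{otherwise,} \end{cases} \tag{8} \]

where $I_i$ is the type of $z_i$. By (8) it is clear that the analog rule $a(l)$ in $H^z$ is the analog rule of $H$ extended with $z$, i.e., $a(l) = l(y) \leftarrow \beta_l(y, y'), l(y').$ For the discrete rule $d(l, a, l')$ of $H^z$ we get: $d(l, a, l') = l(y) \leftarrow {'a'}, \alpha_{(l,a,l')}(y, y'), l'(y').$ with

\[ \alpha_{(l,a,l')}(y, y') = \alpha_{(l,a,l')}(x, x') \wedge \bigwedge_{i=1}^{m} z_i' = z_i . \]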
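The update of integrators along an analog step in (8) amounts to a guarded addition of the delay. A minimal sketch, assuming integrators are kept in a list and their types are given as sets of location names (both representation choices and the function name are assumptions made for illustration):

```python
def advance_integrators(z, types, location, delay):
    """Return z' with z'_i = z_i + delay if location is in the type of z_i,
    and z'_i = z_i otherwise."""
    return [zi + delay if location in ti else zi
            for zi, ti in zip(z, types)]
```

For instance, an integrator of type {rod1, rod2} accumulates the total time during which some control rod is inside the core, while it stays constant in 'no_rod'.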
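For a set $\Omega$ of trajectories in $H$ the $z$-extension $\Omega^z$ consists of all $z$-extended trajectories of $H^z$ the $x$-projections of which are in $\Omega$. For a state $\sigma$ of $H^z$ and a set $\Omega$ of trajectories in $H$ the satisfaction relation $\sigma \models_\Omega \varphi$ is defined inductively on the subformulae of $\varphi$ (cf. Equation (7)).

$\sigma \models_\Omega \psi$ iff $\sigma \in [\![\psi]\!]$ ($[\![\psi]\!]$ defines the region $\langle l, \psi \rangle$).

$\sigma \models_\Omega \neg\varphi$ iff $\sigma \not\models_\Omega \varphi$.

$\sigma \models_\Omega \varphi_1 \vee \varphi_2$ iff $\sigma \models_\Omega \varphi_1$ or $\sigma \models_\Omega \varphi_2$.

$\sigma \models_\Omega \varphi_1 \,\exists\mathcal{U}\, \varphi_2$ iff for some trajectory $\rho \in \Omega^z$ with $sap_\rho(0, 0) = \sigma$ there is a position $\pi$ of $\rho$ such that $sap_\rho(\pi) \models_\Omega \varphi_2$, and for all positions $\pi'$ of $\rho$, if $\pi' < \pi$ then $sap_\rho(\pi') \models_\Omega \varphi_1 \vee \varphi_2$.

$\sigma \models_\Omega \varphi_1 \,\forall\mathcal{U}\, \varphi_2$ iff for all trajectories $\rho \in \Omega^z$ with $sap_\rho(0, 0) = \sigma$ there is a position $\pi$ of $\rho$ such that $sap_\rho(\pi) \models_\Omega \varphi_2$, and for all positions $\pi'$ of $\rho$, if $\pi' < \pi$ then $sap_\rho(\pi') \models_\Omega \varphi_1 \vee \varphi_2$.

$\sigma \models_\Omega (z : I).\varphi$ iff $\sigma[z/0] \models_\Omega \varphi$.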


Example 2 Reactor Temperature Control System. The hybrid system $H$ for Example 1 is the product $H$ = CONTROLLER $\times$ ROD$_1$ $\times$ ROD$_2$. In the ICTL specification language the safety requirement for the temperature control system, 'A shutdown of the reactor never occurs', is specified as:

\[ \varphi_{initial} \Rightarrow \forall\Box\, \neg\varphi_{final} , \tag{9} \]

where $\varphi_{initial} = (\ell = (no\_rod, out_1, out_2) \wedge \theta = \theta_m \wedge x_1 = T \wedge x_2 = T)$ and $\varphi_{final} = (\ell = (no\_rod, out_1, out_2) \wedge \theta = \theta_M \wedge x_1 < T \wedge x_2 < T)$. $H$ starts at the initial location $(no\_rod, out_1, out_2)$ with $\theta = \theta_m$ and $x_1 = T \wedge x_2 = T$. To meet the safety requirement (9), $H$ must not reach a state $\langle (no\_rod, out_1, out_2), [\theta, x_1, x_2] \rangle$ with $\theta = \theta_M$ and no control rod available because $x_1 < T$ and $x_2 < T$.


2 A CLP($\mathcal{R}$) System for Hybrid Systems

Firstly, we describe the general structure $\mathcal{R}$ and show some special cases for $\mathcal{R}$ (Sect. 2.1). Secondly, the equivalent semantics of CLP($\mathcal{R}$) are considered and some results for hybrid systems are lifted from the CLP theory surveyed in [3] (Sect. 2.2).


2.1 The CLP($\mathcal{R}$) Structure

The structure $\mathcal{R}$² $= (\Sigma, R, \mathcal{L}, \mathcal{T})$ defines the underlying domain of discourse $R$ and the operations and relations on it, where $\Sigma = (\mathbb{R}, \{0, 1\}, \{+, *, \exp\}, \{=, <, \leq\} \cup T)$. If we omit $\exp$ in the definition of $\mathcal{R}$, we write $\mathcal{R}_{lin}$. Thus, linear (non-linear) hybrid systems are CLP($\mathcal{R}_{lin}$) (CLP($\mathcal{R}$)) programs.
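2.2 Semantics of Hybrid Systems

Logical Semantics. The two logical semantics of CLP($\mathcal{R}$) programs applied to hybrid systems lead to the following proposition.

Proposition 2. For a hybrid system $H$ the $\Sigma$-theory, also denoted by $H$, and the Clark completion $H^*$ are the sets (cf. Proposition 1):

\[
H = \{ \forall x.(\phi(x) \rightarrow l_0(x)) \} \;\cup \bigcup_{d(l,a,l') \in H} \{ \forall x, x'.(\alpha_{(l,a,l')}(x, x') \wedge l'(x') \rightarrow l(x)) \} \;\cup \bigcup_{a(l) \in H} \{ \forall x, x'.(\beta_l(x, x') \wedge l(x') \rightarrow l(x)) \} , \tag{10}
\]

\[
\begin{aligned}
H^* = {} & \Big\{ \forall x.\Big( l_0(x) \leftrightarrow \phi(x) \vee \bigvee_{d(l_0,a,l') \in H} \exists x'.(\alpha_{(l_0,a,l')}(x, x') \wedge l'(x')) \vee \exists x'.(\beta_{l_0}(x, x') \wedge l_0(x')) \Big) \Big\} \\
& \cup \bigcup_{l \in L \setminus \{l_0\}} \Big\{ \forall x.\Big( l(x) \leftrightarrow \bigvee_{d(l,a,l') \in H} \exists x'.(\alpha_{(l,a,l')}(x, x') \wedge l'(x')) \vee \exists x'.(\beta_l(x, x') \wedge l(x')) \Big) \Big\} .
\end{aligned}
\tag{11}
\]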

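Fixpoint Semantics. The fixpoint semantics is based on two one-step functions $T_H^{\mathcal{R}}$ and $S_H^{\mathcal{R}}$. The closure operator generated by $T_H^{\mathcal{R}}$ is denoted by $[\![H]\!]$. $T_H^{\mathcal{R}}$ and $[\![H]\!]$ map over $\mathcal{R}$-interpretations. The set of $\mathcal{R}$-interpretations forms a complete lattice under '$\subseteq$'. Both $T_H^{\mathcal{R}}$ and $[\![H]\!]$ are continuous on $B_{\mathcal{R}} = \{ l(\varpi) \mid l \in L, \varpi \in \mathbb{R}^n \}$. For a set of facts $X$, $[X]_{\mathcal{R}}$ denotes the set $\{ v(l) \mid (l \leftarrow c.) \in X, \mathcal{R} \models v(c) \}$.

\[ T_H^{\mathcal{R}}(X) = \{ l(\varpi) \mid l(x) \leftarrow c, l'. \in H,\ a \in X,\ v \text{ is a valuation on } \mathcal{R} \text{ such that } \mathcal{R} \models v(c),\ v(x) = \varpi \text{ and } v(l') = a \} . \tag{12} \]

$S_H^{\mathcal{R}}$ is defined on sets of facts, which form a complete lattice under '$\subseteq$'.

\[ S_H^{\mathcal{R}}(X) = \{ l(x) \leftarrow c. \mid l(x) \leftarrow c', l'. \in H,\ a \leftarrow c''. \in X \text{ renamed to new variables and } \mathcal{R} \models (c \leftrightarrow c' \wedge c'' \wedge l' = a) \} . \tag{13} \]

The closure operator generated by $S_H^{\mathcal{R}}$ is denoted by $\langle\!\langle H \rangle\!\rangle$. Both $S_H^{\mathcal{R}}$ and $\langle\!\langle H \rangle\!\rangle$ are continuous. Between $S_H^{\mathcal{R}}$ and $T_H^{\mathcal{R}}$ there exists the relation: $[S_H^{\mathcal{R}}(X)]_{\mathcal{R}} = T_H^{\mathcal{R}}([X]_{\mathcal{R}})$.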

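Proposition 3. For hybrid systems $H$, $H_1$, $H_2$ and a set of facts $Q$ over the constraint domain $\mathcal{R}$ with corresponding theory $\mathcal{T}$ the following holds:

1. $T_H^{\mathcal{R}} \uparrow \omega = \mathrm{lfp}(T_H^{\mathcal{R}}) = [\mathrm{lfp}(S_H^{\mathcal{R}})]_{\mathcal{R}} = [\![H]\!](\emptyset)$.

2. $lm(H, \mathcal{R}) = [\{ l \leftarrow c \mid H^*, \mathcal{R} \models (c \rightarrow l) \}]_{\mathcal{R}} = [\{ l \leftarrow c \mid H^*, \mathcal{T} \models (c \rightarrow l) \}]_{\mathcal{R}}$.

3. $lm(H^*, \mathcal{R}) = lm(H, \mathcal{R}) = \mathrm{lfp}(T_H^{\mathcal{R}})$.

4. $gm(H^*, \mathcal{R}) = \mathrm{gfp}(T_H^{\mathcal{R}})$.

5. $[\![H]\!]([Q]_{\mathcal{R}}) = [\![H \cup Q]\!](\emptyset) = lm(H \cup Q, \mathcal{R})$.

6. $\langle\!\langle H \rangle\!\rangle(Q) = \langle\!\langle H \cup Q \rangle\!\rangle(\emptyset) = \mathrm{lfp}(S_{H \cup Q}^{\mathcal{R}})$.

7. $\mathcal{R} \models (H_1 \leftrightarrow H_2)$ iff $[\![H_1]\!] = [\![H_2]\!]$.

² Our results also hold for a constraint domain isomorphic to $\mathcal{R}$.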

Top-Down Semantics. The operational semantics is given as a transition system of CLP($\mathcal{R}$) states together with a computation rule. Starting with the initial fact, the computation rule consists of an 'alternating selection of analog and discrete rules in a top-down left-to-right Prolog style': top-down for the selection of an applicable rule and left-to-right for the subgoal selection. We give a specific transition system for hybrid systems that computes a symbolic trajectory of length $m$ by calling the hybrid system with the goal $trajectory([L_0, L_1, \ldots, L_m])$. The $L_i$ are location metavariables ranging over the set $L$.

Transition System. The transition system consists of the transition rules $\rightarrow_{init}$, $\rightarrow_a$ and $\rightarrow_d$ such that:

• $\langle trajectory([L_0, \ldots, L_m]), C, S \rangle \rightarrow_{init} \langle \phi(x) \bullet l_0(x) \bullet trajectory([L_1, \ldots, L_m]), C, S \rangle$,
if $L_0$ is a location metavariable, $l_0(x) \leftarrow \phi(x).$ is the initial fact renamed to new variables, and both $L_0$ and $l_0$ are the same. Meaning: Symbolic trajectories begin in the initial location.

• $\langle trajectory([L_0, \ldots, L_m]), C, S \rangle \rightarrow_{init} fail$,
if $L_0$ is a location metavariable, and for the initial fact $l_0(x) \leftarrow \phi(x).$ $L_0$ and $l_0$ are different. Meaning: Only symbolic trajectories starting in the initial location may be accepted.

• $\langle l_i(x) \bullet trajectory([L_{i+1}, \ldots, L_m]), C, S \rangle \rightarrow_a \langle \beta_{l_i}(x, x') \bullet l_i(x') \bullet trajectory([L_{i+1}, \ldots, L_m]), C, S \rangle$,
if $l_i$ is a location, $l(x) \leftarrow \beta_l(x, x'), l(x').$ is an analog rule renamed to new variables, and both $l$ and $l_i$ are the same. Meaning: Application of an analog rule means firstly to select the analog rule and secondly to solve the constraints $\beta_{l_i}(x, x')$.

• $\langle l_i(x) \bullet trajectory([L_{i+1}, \ldots, L_m]), C, S \rangle \rightarrow_a fail$,
if $l_i$ is a location, and for each analog rule $l(x) \leftarrow \beta_l(x, x'), l(x').$ $l$ and $l_i$ are different. Meaning: There must be at most one applicable analog rule for each location.

• $\langle l_i(x) \bullet trajectory([\,]), C, S \rangle \rightarrow_a \langle \beta_{l_i}(x, x'), C, S \rangle$,
if $l_i$ is a location, $l(x) \leftarrow \beta_l(x, x'), l(x').$ is an analog rule renamed to new variables, and both $l$ and $l_i$ are the same. Meaning: The computation ends by the application of the analog rule corresponding to the last location.

• $\langle l_i(x) \bullet trajectory([L_{i+1}, \ldots, L_m]), C, S \rangle \rightarrow_d \langle \alpha_{(l,a,l')}(x, x') \bullet l'(x') \bullet trajectory([L_{i+2}, \ldots, L_m]), C, S \rangle$,
if $l_i$ is a location, $l(x) \leftarrow {'a'}, \alpha_{(l,a,l')}(x, x'), l'(x').$ is a discrete rule renamed to new variables, and $l$ and $l_i$, and $l'$ and $L_{i+1}$, are the same, respectively. Meaning: Application of a discrete rule means firstly to select an applicable discrete rule and secondly to solve the constraints $\alpha_{(l,a,l')}(x, x')$.

• $\langle l_i(x) \bullet trajectory([L_{i+1}, \ldots, L_m]), C, S \rangle \rightarrow_d fail$,
if $l_i$ is a location, and for each discrete rule $l(x) \leftarrow {'a'}, \alpha_{(l,a,l')}(x, x'), l'(x').$ $l$ and $l_i$ are different. Meaning: If there is no applicable discrete rule as required for the computation, then the computation fails.

The transition rules $\rightarrow_c$ and $\rightarrow_s$ are defined as usual. The transition system can easily be extended to handle more general goals like $G = c(x) \wedge L_m(x) \wedge trajectory([L_0, \ldots, L_m])$ or $G = c(x) \wedge L_m(x)$, expressing the reachability of the property (state predicate) $q = c(x) \wedge L_m(x)$ in the symbolic trajectory through $L_0, \ldots, L_m$. The exhaustive search method of CLP($\mathcal{R}$) can be prohibitive for certain properties and hybrid systems. Thus, to analyse hybrid systems we propose in Sect. 3 the use of some evaluation techniques tailored to hybrid systems.

Example 3. Consider the reactor temperature control system from Examples 1 and 2. The goal $trajectory([(no\_rod, out_1, out_2), (rod_1, in_1, out_2), (no\_rod, out_1, out_2)])$ computes the symbolic trajectory $\langle (no\_rod, out_1, out_2), \Psi_0 \rangle \mapsto \langle (rod_1, in_1, out_2), \Psi_1 \rangle \mapsto \langle (no\_rod, out_1, out_2), \Psi_2 \rangle$ with constraints $\Psi_i$. To compute a set of symbolic trajectories we call, for instance, the goal $trajectory([(no\_rod, out_1, out_2), L_1, L_2, L_3, (no\_rod, out_1, out_2)])$, which will by backtracking compute a set of symbolic trajectories with $L_1, L_3 \in \{(rod_1, in_1, out_2), (rod_2, out_1, in_2)\}$ and $L_2 = (no\_rod, out_1, out_2)$. To check whether the prohibited state $(no\_rod, out_1, out_2) \wedge \theta = \theta_M \wedge x_1 < T \wedge x_2 < T$ can be reached by some (symbolic) trajectory of length $m$ we could call $\theta = \theta_M \wedge x_1 < T \wedge x_2 < T \wedge (no\_rod, out_1, out_2)(\theta, x_1, x_2) \wedge trajectory([(no\_rod, out_1, out_2), L_1, \ldots, L_{m-1}, (no\_rod, out_1, out_2)])$, which will fail for every $m > 0$.
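The effect of backtracking over the location metavariables in Example 3 can be mimicked on the location graph alone. The sketch below enumerates the location sequences only; it does not solve any constraints, so it over-approximates the symbolic trajectories that a CLP($\mathcal{R}$) system would return. The edge list is transcribed by hand from the discrete rules of the product, and the short location names and the function name are assumptions made for illustration.

```python
def location_paths(edges, start, end, steps):
    """Yield every location sequence with `steps` discrete steps
    that leads from `start` to `end` along the given edges."""
    def extend(path):
        if len(path) == steps + 1:
            if path[-1] == end:
                yield list(path)
            return
        for (src, dst) in edges:
            if src == path[-1]:
                yield from extend(path + [dst])
    yield from extend([start])


N = ("no_rod", "out1", "out2")
R1 = ("rod1", "in1", "out2")
R2 = ("rod2", "out1", "in2")
S = ("shutdown", "out1", "out2")

# Discrete steps of the product, without guards.
EDGES = [(N, R1), (N, R2), (N, S), (R1, N), (R2, N)]

PATHS = list(location_paths(EDGES, N, N, 4))
```

Each element of `PATHS` has the shape [N, L1, L2, L3, N]; the shutdown location never occurs on such a path because it has no outgoing discrete step.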

Proposition 4. The computation rule mentioned above is a fair computation rule.

Theorem 5 (Soundness and Completeness). For a hybrid system $H$ the goal $trajectory([L_0, \ldots, L_m])$ with location metavariables $L_i$ ($i = 0, \ldots, m$, $m \geq 0$) is successful in a CLP($\mathcal{R}$) system with answer constraint $(c_0(x_0, x_0'), \ldots, c_m(x_m, x_m'))$ iff the sequence of regions $\langle L_0, \Psi_0(x_0, x_0') \rangle \mapsto \cdots \mapsto \langle L_m, \Psi_m(x_m, x_m') \rangle$ with $\mathcal{R} \models (c_i(x_i, x_i') \leftrightarrow \Psi_i(x_i, x_i'))$ for all $i = 0, \ldots, m$ is a symbolic trajectory of $H$.

Now, for a hybrid system $H$ we consider the success set SS($H$) = $\bigcup_{m=0}^{\infty}$ SS($H$)$^m$, which collects the answer constraints of all finite and infinite symbolic trajectories. SS($H$)$^m$ is the set $\{ l_0(x) \leftarrow c_0., \ldots, l_m(x) \leftarrow c_m. \}$ of facts $l_i(x) \leftarrow c_i.$ which are built from the answer constraint $(c_0(x_0, x_0'), \ldots, c_m(x_m, x_m'))$ of the corresponding symbolic trajectory of length $m$. The finite failure set FFS($H$) = $\bigcup_{m=0}^{\infty}$ FFS($H$)$^m$ collects the set of finitely failed goals of the form $c(x) \wedge l_m(x) \wedge trajectory([L_0, \ldots, l_m])$. It describes the set of all (symbolic) trajectories in a hybrid system which finitely fail.

\[ \mathrm{FFS}(H)^m = \{ l_m(x) \leftarrow c(x). \mid \text{for every fair derivation } \langle l_m(x) \wedge trajectory([L_0, \ldots, L_{m-1}, l_m]), \{c(x)\}, \emptyset \rangle \rightarrow \cdots \rightarrow fail \} . \tag{14} \]

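Proposition 6. Let CLP($\mathcal{D}$) be an ideal system where $\mathcal{D}$ and $\mathcal{T}$ correspond on $\mathcal{C}$. Then:

1. SS($H$) = lfp($S_H^{\mathcal{D}}$) and $[\mathrm{SS}(H)]_{\mathcal{D}} = lm(H, \mathcal{D})$.

2. If the goal $G$ has a successful derivation with answer constraint $c$, then $H, \mathcal{T} \models (c \rightarrow G)$.

3. Now, suppose $\mathcal{T}$ is satisfaction complete wrt $\mathcal{C}$. If $G$ has a finite computation tree, with answer constraints $c_1, \ldots, c_m$, then $H^*, \mathcal{T} \models (G \leftrightarrow c_1 \vee \cdots \vee c_m)$.

4. If $H, \mathcal{T} \models (c \rightarrow G)$ then there are derivations for the goal $G$ with answer constraints $c_1, \ldots, c_m$ such that $\mathcal{T} \models (c \rightarrow c_1 \vee \cdots \vee c_m)$. If, in addition, $(\mathcal{D}, \mathcal{C})$ has the independence of negated constraints property, then the result holds for $m = 1$.

5. Now, suppose $\mathcal{T}$ is satisfaction complete wrt $\mathcal{C}$. If $H^*, \mathcal{T} \models (G \leftrightarrow c_1 \vee \cdots \vee c_m)$, then $G$ has a computation tree with answer constraints $c_1', \ldots, c_n'$ such that $\mathcal{T} \models (c_1 \vee \cdots \vee c_m \leftrightarrow c_1' \vee \cdots \vee c_n')$.

6. Now, suppose $\mathcal{T}$ is satisfaction complete wrt $\mathcal{C}$. Then the goal $G$ is finitely failed for $H$ iff $H^*, \mathcal{T} \models \neg G$.

7. Suppose $(\mathcal{D}, \mathcal{C})$ is solution compact. Then $T_H^{\mathcal{D}} \downarrow \omega = B_{\mathcal{D}} \setminus [\mathrm{FFS}(H)]_{\mathcal{D}}$.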

Bottom-Up Semantics. The bottom-up semantics starts with a set of facts and computes step by step, supposing the iteration terminates, a representation of the least model of $H$, $lm(H, \mathcal{R})$. The bottom-up execution of a hybrid system is defined as a transition system between sets of facts. For each rule $l(x) \leftarrow c, l'. \in H$ and sets of facts $A$, $B$ the relation $A \leadsto B$ is defined as follows:

\[ A \leadsto B \iff B = A \cup \{ l(x) \leftarrow c'. \mid l(x) \leftarrow c, l'. \in H \text{ and there exists a fact } l_1(x) \leftarrow c_1. \in A \text{ with } \mathcal{R} \models (c' \leftrightarrow c \wedge c_1 \wedge l_1 = l') \} . \tag{15} \]

By definition of the operator $S_H^{\mathcal{R}}$ in (13) it is clear that $A \leadsto B$ iff $B = A \cup S_{\{r\}}^{\mathcal{R}}(A)$ for a rule $r \in H$.

Execution. An execution is a sequence of transitions of the form (15). An execution is fair if each rule is applied infinitely often. The execution $A_0 \leadsto A_1 \leadsto A_2 \leadsto \ldots$ terminates if there exists an $m$ such that $A_k = A_m$ for each $k \geq m$. For a hybrid system $H$ we say: $H$ is finitely computable if for each finite initial set $A_0$ of facts and for each fair execution there is an $m$ such that $[A_k]_{\mathcal{R}} = [A_m]_{\mathcal{R}}$ for all $k \geq m$. An execution can be non-terminating even for finitely computable hybrid systems and a finite initial set $A_0$.
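The shape of a terminating bottom-up execution can be sketched on a propositional abstraction in which every constraint is dropped and a fact is just a location name. This is an assumption made purely for illustration: the real execution manipulates constrained facts and needs a subsumption check to detect the fixpoint. Rules are written as (head, body) pairs, mirroring $l(x) \leftarrow c, l'.$ with head $l$ and body atom $l'$.

```python
def bottom_up(rules, facts):
    """Iterate A_{k+1} = A_k united with all heads whose body atom is in A_k,
    and stop as soon as A_{k+1} = A_k."""
    current = set(facts)
    while True:
        derived = {head for (head, body) in rules if body in current}
        nxt = current | derived
        if nxt == current:
            return current
        current = nxt


# Discrete rules of CONTROLLER without constraints: head <- body.
RULES = [("no_rod", "rod1"), ("no_rod", "rod2"),
         ("rod1", "no_rod"), ("rod2", "no_rod"),
         ("no_rod", "shutdown")]
```

Because rule heads are source locations, starting from the fact 'shutdown' collects the locations from which a shutdown is reachable in the abstraction, which matches the backward flavour of the bottom-up evaluation.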

Theorem 7. Let $H$ be a hybrid system, $Q$ a set of facts and $A$ the result of a fair bottom-up execution. Then $A = \mathrm{SS}(H \cup Q) = \langle\!\langle H \rangle\!\rangle(Q)$ and $[\![H]\!]([Q]_{\mathcal{R}}) = [A]_{\mathcal{R}}$.

Example 4. To verify the safety requirement (9) of the reactor temperature control system we can show by bottom-up evaluation that if the execution (starting with $\varphi_{final} = (\ell = (no\_rod, out_1, out_2) \wedge \theta = \theta_M \wedge x_1 < T \wedge x_2 < T)$)

\[ \{ (no\_rod, out_1, out_2)(\theta, x_1, x_2) \leftarrow \theta = \theta_M \wedge x_1 < T \wedge x_2 < T. \} \leadsto \cdots \leadsto A \]

finishes with the set of facts $A$, then for $A$ the initial goal (corresponding to the set of initial states $\varphi_{initial} = (\ell = (no\_rod, out_1, out_2) \wedge \theta = \theta_m \wedge x_1 = T \wedge x_2 = T)$)

\[ G = \{ \theta = \theta_m \wedge x_1 = T \wedge x_2 = T \wedge (no\_rod, out_1, out_2)(\theta, x_1, x_2). \} \]

fails.

The top-down and bottom-up evaluation methods require a method to check whether the fixpoint is reached. Srivastava gave in [5] several algorithms for the subsumption check in CLP($\mathcal{R}_{lin}$). These are considered in Sect. 3.7, and in more detail in [8].


3  Evaluation  Techniques  for  Hybrid  Systems 

In this section we present several verification techniques for hybrid systems. The most important of them are the reachability analysis (Sect. 3.1), the proof of safety requirements (Sect. 3.2), the proof of duration properties (Sect. 3.3), the parameterized reachability analysis (Sect. 3.4), and the symbolic model-checking for ICTL formulae (Sect. 3.5). The delay of constraints strategy (Sect. 3.6), the subsumption and indexing of constraints [5] (Sect. 3.7), and the intelligent backtracking strategy [2] (Sect. 3.8), in turn, may be useful for improving the efficiency of the proof methods.


3.1  Reachability  Analysis 

The problem of checking whether a property $\varphi$ represented as a $z$-extended state predicate is reachable in a given hybrid system $H$ can be solved by top-down/bottom-up fixpoint computation. Let $\varphi_{final} = \bigcup_{l \in L} \langle l, \Psi_l \rangle$ be a final region describing a set of states. As every $\Psi_l$ is a convex linear formula, $\varphi_{final}$ can be rewritten as a final goal $G_{final} = \bigvee_{l \in L} (\Psi_l \wedge l(y))$ or as a set of final facts $A_{final} = \{ l(y) \leftarrow \Psi_l. \mid l \in L \}$. Correspondingly, for the initial region $\varphi_{initial}$ we get the initial goal $G_{initial}$.


Proposition 8. 1. If $SS(H) = \mathrm{lfp}(S_H)$ terminates and $\mathcal{R}$ is satisfaction complete, then $\varphi_{final}$ is reachable in $H$ iff $G_{final}$ is successful in $SS(H)$.

2. Let $A$ be the result of the fair bottom-up execution $A_{final} \leadsto A$. Then $\varphi_{final}$ is reachable in $H$ iff $G_{initial}$ is successful in $A$.

3. If $SS(H \cup A_{final}) = \mathrm{lfp}(S_{H \cup A_{final}})$ and $\mathcal{R}$ is satisfaction complete, then $\varphi_{final}$ is reachable in $H$ iff $G_{initial}$ is successful in $SS(H \cup A_{final})$.


Proof. 1. By the definition of $SS(H)$, and since $\varphi_{final}$ is reachable in $H$, there exists an $m > 0$ such that for $SS(H)^m = \{\ell_0(y) \leftarrow c_0., \ldots, \ell_m(y) \leftarrow c_m.\}$ the final goal
Gfinai  =  ^Ky))  is  successful.  Since  Gfinai  =  in 

SS(JT)  successful,  then  there  exists  a  computation  tree  from  Gfinai  with  answer 
constraints  ci,...,C5,  s  =  1,...|L|  with  R  (ci,...,C5  4=>  Al{y))). 

Consequently,  p final  is  in  H  reachable. 

2. and 3. are consequences of 1. and Theorem 7. □
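
The bottom-up reading of Proposition 8 can be illustrated on a finite-state abstraction. The following is only a minimal sketch: it replaces constraint facts by plain state names and the hybrid system by a set of transitions, and the names `bottom_up_fixpoint` and `reachable` are hypothetical, not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def bottom_up_fixpoint(transitions: Set[Tuple[str, str]], final: Iterable[str]) -> Set[str]:
    """Return all states from which some state in `final` can be reached."""
    facts = set(final)  # start from the final facts
    while True:
        # One bottom-up step: add every predecessor of an already derived state.
        new = {src for (src, dst) in transitions if dst in facts} - facts
        if not new:  # fixpoint check: the step derived nothing new
            return facts
        facts |= new


def reachable(transitions: Set[Tuple[str, str]], initial: Iterable[str], final: Iterable[str]) -> bool:
    """The final region is reachable iff the initial goal succeeds in the fixpoint."""
    return bool(set(initial) & bottom_up_fixpoint(transitions, final))
```

In this simplified setting, a safety requirement in the sense of Sect. 3.2 holds exactly when `reachable` returns `False` for the unsafe states.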

3.2 Proof of Safety Requirements

To prove the safety requirement $\varphi_{initial} \Rightarrow \forall\Box\, \neg\varphi_{final}$, it is transformed into a reachability problem. Then Proposition 8 can be applied.

Proposition 9. Let $H$ be a hybrid system. $H$ meets the safety requirement $\varphi_{initial} \Rightarrow \forall\Box\, \neg\varphi_{final}$ iff $\varphi_{final}$ is unreachable in $H$.

3.3  Proof  of  Duration  Properties 


Duration  properties  can  be  proved  by  reachability  analysis.  This  is  done  by  displac¬ 
ing  the  clocks  and  integrators  occurring  in  an  ICTL  duration  formula  into  the  hybrid 
system  description.  Then  the  duration  property  is  transformed  into  a  reachability 
problem  and  verified  for  the  modified  hybrid  system  as  mentioned  in  Proposition  8. 


Example 5. The duration property 'Each control rod will be used at most 1/3 of the allowed time T' can be specified in ICTL as follows [1]:

$\varphi_{initial} \Rightarrow (z : L)(z_1 : (\_, m_1, \_))(z_2 : (\_, \_, m_2)).\ \forall\Box\, (3z_1 \leq z \wedge 3z_2 \leq z)$, (16)

where $\varphi_{initial} = \ell = (no\_rod, out_1, out_2) \wedge \theta = \theta_m \wedge x_1 < T \wedge x_2 < T$, $z$ is a clock, and $z_1$, $z_2$ are integrators. To verify this property, the verification problem is rewritten as a reachability problem. The clock $z$ proceeds as the global system time. For $z$ the analog and discrete relations take the form $\beta'_\ell(y, y') = \beta_\ell(x, x') \wedge z' = z + \delta$ and $\alpha'_{(\ell,a,\ell')}(y, y') = \alpha_{(\ell,a,\ell')}(x, x') \wedge z' = z$, respectively. The integrators $z_1$ and $z_2$ advance in parallel with the system time only in the sets of locations $\{(\_, m_1, \_)\}$ and $\{(\_, \_, m_2)\}$, respectively. Otherwise, they do not change:

$\beta'_\ell(y, y') = \beta_\ell(x, x') \wedge z_1' = z_1 + \delta$ if $\ell \in \{(\_, m_1, \_)\}$, and $\beta_\ell(x, x') \wedge z_1' = z_1$ otherwise;

$\beta'_\ell(y, y') = \beta_\ell(x, x') \wedge z_2' = z_2 + \delta$ if $\ell \in \{(\_, \_, m_2)\}$, and $\beta_\ell(x, x') \wedge z_2' = z_2$ otherwise;

$\alpha'_{(\ell,a,\ell')}(y, y') = \alpha_{(\ell,a,\ell')}(x, x') \wedge z_1' = z_1 \wedge z_2' = z_2$.


Equation (16) thereby yields $\varphi_{initial} \Rightarrow \forall\Box\, \neg\varphi_{final}$ with $\varphi_{final} = 3z_1 > z \vee 3z_2 > z$. For this hybrid system it must be checked whether the unsafe states $[\![3z_1 > z \vee 3z_2 > z]\!]$ are reachable. That is, (16) holds iff $\varphi_{final}$ is unreachable.


3.4  Parameterized  Reachability  Analysis 


The top-down and bottom-up fixpoints can also be used to verify parameterized hybrid systems. They offer a method to compute necessary and sufficient conditions on the parameters under which the reachability of a property can be ensured. Here, the corresponding CLP($\mathcal{R}$) program description of the hybrid system has to be adapted accordingly. For a vector $p = p_1, \ldots, p_k$ of parameters, the resulting p-extended hybrid system $H_p$ is built in the same way as the z-extended system, but here the $p_i$ are parameters and therefore $\beta'_\ell(w, w') = \beta_\ell(x, x') \wedge \bigwedge_{i=1}^{k} p_i' = p_i$ and $\alpha'_{(\ell,a,\ell')}(w, w') = \alpha_{(\ell,a,\ell')}(x, x') \wedge \bigwedge_{i=1}^{k} p_i' = p_i$, where $w = x, p$ and $w' = x', p'$.


Proposition 10. Let $\varphi_{initial} \Rightarrow \forall\Box\, \neg\varphi_{final}$ be a safety property, $p$ a vector of parameters, $A_{final}$ the final set of facts, and $A_{final} \leadsto A$ the bottom-up fixpoint of the parameterized hybrid system. Then the p-projection of the answer constraint of the initial goal $G_{initial}$ executed in $A$ provides a sufficient and necessary condition $\varphi$ over $p$ such that for valuations of $p$ which make $\varphi$ true the safety requirement is not met. In other words, the safety property does not hold iff $\varphi$ is true.


3.5  Symbolic  Model  Checking 

The symbolic model-checking procedure SMC of [1] verifies for an ICTL formula $\varphi$ and a linear hybrid system $H$ whether $H$ meets $\varphi$. The SMC procedure computes for a linear hybrid system $H$ and an ICTL formula $\varphi$ the characteristic region $[\![\varphi]\!]_H$ by providing a state predicate $\psi$ which defines the answer, i.e., $[\![\psi]\!]_H$. That is: $[\![\psi]\!]_H = [\![\varphi]\!]_H$. The state predicate $\psi$ is called a characteristic predicate of $\varphi$. A characteristic predicate does not exist in general, and it is undecidable whether a given state predicate is a characteristic predicate of a given ICTL formula $\varphi$. Our SMC procedure computes a set of facts $|\varphi|$ which indirectly defines the characteristic predicate $\psi$. In the SMC procedure we mainly use the bottom-up semantics of hybrid systems.


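$|\psi| := \bigcup_{\ell \in L} \{\ell(x) \leftarrow \psi_\ell \wedge \mathrm{Inv}(\ell).\}$, where $\psi = \bigcup_{\ell \in L} \langle \ell, \psi_\ell \rangle$. (17)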

:=  kr  • 

ipi\  :=  bilU|vJ2|  • 

\pi  37/  v^2|  :=  U  Xi  where  xo  :=  lv^2|  and  x,+i  :=  Xi  C 
i>0 

with  5^.(1 I  U  |v?2|) 


{pi'^U  p2\  :=  IJ  X*  where  xo  :=  ^2!  and 

Xi+i  :=  IXi  VW<p2.(“^X»  3//  (*^(<^1  V  Xi)  V  vw  ^>2  >  l))l  • 

(21) 

$|(z : I).\,\varphi| := |\varphi|[z/0]$ (i.e., replace all occurrences of $z$ in $|\varphi|$ by 0). (22)

Proof. Equation (17): The set of facts $|\psi|$ for the z-extended state predicate $\psi = \bigcup_{\ell \in L} \langle \ell, \psi_\ell \rangle$ consists of the set of admissible states that meet $\psi$. Equations (18)

(Footnote to Proposition 10.) The prototype system CLP($\mathcal{R}$) [4] offers the operator dump to project constraints onto variables.
and (19) are clear. Equation (20): The computation of the set of facts $|\varphi_1\, \exists\mathcal{U}\, \varphi_2|$ for $\varphi_1\, \exists\mathcal{U}\, \varphi_2$ mainly uses the bottom-up execution $X_0 \leadsto X_1 \leadsto \cdots \leadsto X_{i+1}$, which is another representation of:

$|\varphi_2| \leadsto |\varphi_2| \cup B_1 \leadsto |\varphi_2| \cup B_1 \cup B_2 \leadsto \cdots \leadsto |\varphi_2| \cup B_1 \cup \cdots \cup B_{i+1}$. (23)

If this execution terminates with the set of facts $A$, then $A = |\varphi_1\, \exists\mathcal{U}\, \varphi_2|$. In every bottom-up step new sets of facts $B_i$ are computed. Each fact in $B_i$ is a concise description of states contained in $X_i$ that meet $\varphi_1$ or $\varphi_2$. Equation (21): This proof is made in two steps. In the first step the operator $\forall\mathcal{U}$ is transformed into a timed operator $\forall\mathcal{U}_{\leq c}$ with a constant $c$. In the second step $\forall\mathcal{U}_{\leq c}$ is converted into the operator $\exists\mathcal{U}$, such that the proof of (21) is reduced to the proof of (20). Equation (22): It expresses the fact that the integrator $z$ has to be set to zero, since from that moment onwards its value plays an important role in the verification of $\varphi$. □


3.6  Delay  of  Constraints  Strategy 


The Delay of Constraints Strategy (DCS) is a small extension of the top-down and bottom-up semantics. Conceptually, the idea is as follows: After each top-down/bottom-up step, a database of successful calls together with their generated answer constraints is maintained. If such a call is made later in the computation, the call is not re-executed; instead, the stored answer constraint is used to update the current position as though the call had been executed and had returned that constraint. Moreover, if the constraint in a step is linear, it is solved directly. Otherwise, it is delayed and may get solved later, if it becomes linear during the computation. Therefore, DCS can be used in the verification methods mentioned above.
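
The call database described above behaves like a memo table. The sketch below illustrates only that part of the strategy, under simplifying assumptions: answer constraints are stood in for by plain strings, the delaying of non-linear constraints is not modelled, and the class `CallTable` is a hypothetical name rather than part of any CLP system.

```python
from typing import Callable, Dict, Hashable


class CallTable:
    """Hypothetical table of successful calls and their answer constraints."""

    def __init__(self) -> None:
        self.answers: Dict[Hashable, str] = {}
        self.executions = 0  # number of calls that were really executed

    def call(self, goal: Hashable, execute: Callable[[Hashable], str]) -> str:
        if goal in self.answers:
            # The call was made before: reuse the stored answer constraint.
            return self.answers[goal]
        self.executions += 1
        answer = execute(goal)  # first occurrence: run the call
        self.answers[goal] = answer
        return answer
```

Calling `table.call(goal, execute)` twice with the same goal runs `execute` once; the second call returns the stored answer.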


3.7 Subsumption and Indexing of Constraints


Top-down and bottom-up evaluation require a fixpoint check in each iteration step, i.e., a check whether any new facts have been computed in that step. Due to the nondeterministic features of hybrid systems, several answer constraints may be computed in one iteration step. A subsumption check enhances efficiency if subsumed facts can be discarded in each iteration step. Thus, we propose to combine the fixpoint check and the subsumption check in each iteration step. Srivastava [5] gave several algorithms to improve the efficiency of the subsumption check for CLP($\mathcal{R}_{lin}$) programs. The first one is a deterministic algorithm based on the divide-and-conquer strategy. The second one is an indexing algorithm, where only those polyhedra are indexed which do not intersect with a given polyhedron, such that they can be efficiently eliminated. The third algorithm is an incremental algorithm which interleaves the computation of a constraint with a subsumption check. For the application of these algorithms to linear hybrid systems we refer to [8].
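
The combination of fixpoint check and subsumption check can be sketched as follows. This is not one of Srivastava's algorithms: as a loudly simplified stand-in for polyhedra, each constraint is an axis-parallel box (one closed interval per variable), and the names `subsumes` and `add_facts` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[Tuple[float, float], ...]  # one (low, high) interval per variable
Fact = Tuple[str, Box]                 # (location name, constraint as a box)


def subsumes(a: Box, b: Box) -> bool:
    """True if box `a` contains box `b` in every coordinate."""
    return all(al <= bl and bh <= ah for (al, ah), (bl, bh) in zip(a, b))


def add_facts(known: List[Fact], derived: List[Fact]) -> bool:
    """Add non-subsumed facts to `known`; return False once the fixpoint is reached."""
    changed = False
    for loc, box in derived:
        if any(l == loc and subsumes(k, box) for l, k in known):
            continue  # subsumed by a known fact: discard it
        # Drop older facts that the new, larger fact makes redundant.
        known[:] = [(l, k) for l, k in known if not (l == loc and subsumes(box, k))]
        known.append((loc, box))
        changed = True
    return changed
```

An iteration step that only produces subsumed facts makes `add_facts` return `False`, which is the combined fixpoint and subsumption test of this sketch.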

3.8 Intelligent Backtracking Strategy

Intuitively, intelligent backtracking is based on the following idea: During the construction of the computation tree, a failure in a branch is first explained as being due to an inconsistent set of constraints, the conflict set, caused during the application of a discrete rule. A backtrack point has to be chosen in order to remove at least one element of the conflict. Conflicts are computed using a Dynamic Intelligent Backtracking (DIB) algorithm. DIB needs a constraint solver able to detect a conflict when the constraint set is unsolvable. The unsolvability of a constraint set may be explained by several different conflicts. De Backer and Beringer showed in [2] that in CLP($\mathcal{R}_{lin}$) the computation of a minimal conflict has polynomial complexity. The conflict set may give reasons for the unsatisfiability of a property (cf. [8]).

Abstract. It has been argued that the linear database model, in which semi-linear sets are the only geometric objects, is very suitable for most spatial database applications. For querying linear databases, the language FO + linear has been proposed. We present both negative and positive results regarding the expressiveness of FO + linear. First, we show that the dimension query is definable in FO + linear, which allows us to solve several interesting queries. Next, we show the non-definability of a whole class of queries that are related to sets not definable in FO + linear. This result both sharpens and generalizes earlier results independently found by Afrati et al. and the present authors, and demonstrates the need for more expressive linear query languages if we want to sustain the desirability of the linear database model. In this paper, we show how FO + linear can be strictly extended within FO + poly in a safe way. Whether any of the proposed extensions is complete for the linear queries definable in FO + poly remains open. We do show, however, that it is undecidable whether an expression in FO + poly induces a linear query.


1  Introduction 

Following the seminal work by Kuper, Kanellakis, and Revesz [20] on constraint query languages with polynomial constraints (FO + poly), various researchers have introduced geometric database models and query languages within this framework [18, 22]. These researchers have studied the desirability of their models for database applications involving geometric data objects, as well as the expressiveness of the proposed geometric query languages.

An important database model that has recently been studied in this context is the linear spatial database model [2, 3, 26], which we adopt in this paper. The linear model allows users to define relational databases, which may, besides conventional data, contain linear geometric data objects. Formally, these objects are so-called semi-linear sets, which can be defined in first-order logic over the reals with addition. The class of semi-linear sets suffices for the majority of applications encountered in GIS, geometric modeling, and spatial and temporal databases [6, 23]. Furthermore, data structures and algorithms have been developed to efficiently implement a wide variety of operations on these sets [4, 11, 17, 24].

Associated with the linear model is the concept of linear query, which is a mapping from linear databases to linear databases. Because the linear database model is a sub-model of the polynomial database model, it is in principle possible to use the query language FO + poly to define natural linear queries, and, in fact, a vast number of important linear queries can indeed be so defined. Of course, not every query defined by an FO + poly formula induces a linear query, and, as is shown in Section 5, it is even undecidable whether an FO + poly formula induces a linear query.

Faced with this reality, several researchers [3, 26] have proposed the query language FO + linear as a natural query language to accompany the linear model. The FO + linear language is the sub-language of FO + poly wherein the polynomial constraints are restricted to linear constraints. Many important linear queries can be defined in FO + linear. Section 3 reviews some known results in this respect and presents some new ones. The most surprising of those is the definability of the dimension query, which returns the topological dimension of a semi-linear set. This definability result allows us to solve some important practical queries. In particular, it follows that the interval query, i.e., “Is the semi-linear set an interval?” and the line query, i.e., “Is the semi-linear set a line?” are definable in FO + linear.

Unfortunately, FO + linear is incomplete for the linear queries definable in FO + poly, as was recently shown by Afrati et al. [2]. The counter-example used by Afrati et al., however, is a technical one, and does not, in our view, adequately reveal the weaknesses of FO + linear as a language to define linear queries definable in FO + poly. In Section 4, building on the work of Afrati et al. [2] and on earlier work of the present authors [26], we show that natural FO + poly-definable linear queries, such as the query that yields the convex hull of a semi-linear set, are not FO + linear-definable. The conclusion we draw from these negative results is that, though FO + linear provides a good lower bound for the FO + poly-definable linear queries, FO + linear is too limited in expressiveness to be considered fully adequate to accompany the linear model.

This brings us to the last major topic of this paper. In Section 5, we introduce query languages that can only express FO + poly-definable linear queries, but that are strictly more expressive than FO + linear. These languages have some affinity with some operational languages that have been introduced in spatial database models, but that do not fall within the framework of Kuper, Kanellakis, and Revesz [20]. It is presently an open problem whether any of the query languages we propose in Section 5 is complete for the FO + poly-definable linear queries, though we conjecture this is not the case.

2 Preliminaries

In  this  paper  we  focus  on  the  linear  spatial  database  model  as  proposed  in  [26]. 
To  put  this  paper  into  better  perspective  we  briefly  review  some  of  the  material 
in  [26]. 

The linear model is extracted from the polynomial model, which is based on real formulae, i.e., formulae in first-order logic over $(\mathbb{R}, <, +, \times, 0, 1)$. Due to the work of Tarski [21], it is well known that this first-order logic over the reals with inequality, addition, and multiplication is a decidable theory. Every real formula $\varphi(x_1, \ldots, x_n)$ with free real variables $x_1, \ldots, x_n$ defines a geometrical figure $\{(x_1, \ldots, x_n) \mid (x_1, \ldots, x_n) \in \mathbb{R}^n \wedge \varphi(x_1, \ldots, x_n)\}$ in n-dimensional Euclidean space $\mathbb{R}^n$. Point sets defined in this way are called semi-algebraic sets.

A spatial database scheme, $S$, is a finite set of relation names. Each relation name, $R$, has a type which is a pair of natural numbers, $[m, n]$, where $m$ denotes the number of non-spatial columns and $n$ the dimension of the single spatial column of $R$. A database scheme has type $[m_1, n_1, \ldots, m_k, n_k]$ if the scheme consists of relation names, say $R_1, \ldots, R_k$, respectively of type $[m_1, n_1], \ldots, [m_k, n_k]$. A syntactic database instance is a mapping, $\mathcal{I}$, assigning to each relation name, $R$, of a scheme, $S$, a syntactic relation $\mathcal{I}(R)$ of the same type. A syntactic relation of type $[m, n]$ is a finite set of tuples of the form $(v_1, \ldots, v_m; \varphi(x_1, \ldots, x_n))$, with $v_1, \ldots, v_m$ non-spatial values of some domain, $U$, and $\varphi(x_1, \ldots, x_n)$ a real formula with $n$ free variables.

The semantics of a syntactic database instance, $\mathcal{I}$, over a database scheme, $S$, is the mapping, $I$, assigning to each relation name, $R$, in $S$ the semantic relation $I(\mathcal{I}(R))$. Given a syntactic relation, $r$, the semantic relation $I(r)$ is defined as $\bigcup_{t \in r} (\{(t.v_1, \ldots, t.v_m)\} \times \{(x_1, \ldots, x_n) \mid t.\varphi(x_1, \ldots, x_n)\})$. This subset of $U^m \times \mathbb{R}^n$ can be interpreted as a possibly infinite (m + n)-ary relation, called semantic relation, the tuples of which are called semantic tuples.

Example 1. The example in Figure 1 shows a spatial database representing geographical information about Belgium. □

We consider a query of signature $[m_1, n_1, \ldots, m_k, n_k] \rightarrow [m, n]$ to be a mapping from database instances of a spatial database scheme of type $[m_1, n_1, \ldots, m_k, n_k]$ to database instances of a spatial database scheme of type $[m, n]$ that can be regarded in a consistent way both at the syntactic and semantic level, and is computable at the syntactic level.

In this context, we define the query language FO + poly as the language obtained by adding to the language of real formulae the following: (i) a totally ordered infinite set of variables called non-spatial variables, disjoint from the set of real variables, (ii) atomic formulae of the form $v_1 = v_2$, with $v_1$ and $v_2$ non-spatial variables, (iii) atomic formulae of the form $R(v_1, \ldots, v_m; x_1, \ldots, x_n)$, with $v_1, \ldots, v_m$ non-spatial variables, $x_1, \ldots, x_n$ real variables, and $R$ a relation name of type $[m, n]$, and finally (iv) universal and existential quantification of non-spatial variables. A query of signature $[m_1, n_1, \ldots, m_k, n_k] \rightarrow [m, n]$ is definable in FO + poly if there exists an FO + poly formula $\varphi$ with $m$ free value variables

Regions

Name | Geometry
Brussels | $(y \leq 13) \wedge (x \leq 11) \wedge (y \geq 12) \wedge (x \geq 10)$
Flanders | $(y \leq 17) \wedge (5x - y \leq 78) \wedge (x - 14y \leq -150) \wedge (x + y \geq 45) \wedge (3x - 4y \geq -53) \wedge \neg((y \leq 13) \wedge (x \leq 11) \wedge (y \geq 12) \wedge (x \geq 10))$
Walloon Region | $((x - 14y \geq -150) \wedge (y \leq 12) \wedge (19x + 7y \leq 375) \wedge (x - 2y \leq 15) \wedge (5x + 4y \geq 89) \wedge (x \geq 13)) \vee ((-x + 3y \geq 5) \wedge (x + y \geq 45) \wedge (x - 14y \geq -150) \wedge (x \geq 13))$

Rivers

Name | Geometry
Meuse | $((y \leq 17) \wedge (5x - y = 78) \wedge (y \geq 12)) \vee ((y \leq 12) \wedge (x - y = 6) \wedge (y \geq 11)) \vee ((y \leq 11) \wedge (x - 2y = -5) \wedge (y \geq 9)) \vee ((y \leq 9) \wedge (x = 13) \wedge (y \geq 6))$
Scheldt | $((y \leq 17) \wedge (x + y = 26) \wedge (y \geq 16)) \vee ((y \leq 16) \wedge (2x - y = 4) \wedge (y \geq 14)) \vee ((x \leq 9) \wedge (x \geq 7) \wedge (y = 14)) \vee ((y \leq 14) \wedge (-3x + 2y = 7) \wedge (y \geq 11)) \vee ((y \leq 11) \wedge (2x + y = 21) \wedge (y \geq 9))$

Cities

Name | Geometry
Antwerp | $(x = 10) \wedge (y = 16)$
Bastogne | $(x = 19) \wedge (y = 6)$
Bruges | $(x = 5) \wedge (y = 16)$
Brussels | $(x = 10.5) \wedge (y = 12.5)$
Charleroi | $(x = 10) \wedge (y = 8)$
Hasselt | $(x = 16) \wedge (y = 14)$
Liege | $(x = 17) \wedge (y = 11)$

Fig. 1. Example of a (linear) spatial database.

and $n$ free real variables such that, for every input database instance of signature $[m_1, n_1, \ldots, m_k, n_k]$, the expression $\{(v_1, \ldots, v_m, x_1, \ldots, x_n) \mid \varphi(v_1, \ldots, v_m, x_1, \ldots, x_n)\}$

evaluates to the corresponding output database, which is of type $[m, n]$.
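
To make the idea of a relation given by formulae concrete, the following sketch encodes the Brussels region and the city points of Fig. 1 as Python predicates and data, and evaluates a simple "which cities lie in this region" query. It is an illustration only: the bounds of the region are assumed to be non-strict, and the names `in_brussels_region`, `CITIES`, and `cities_in` are hypothetical.

```python
from fractions import Fraction


def in_brussels_region(x, y):
    # Brussels region from Fig. 1; the bounds are assumed to be non-strict.
    return 10 <= x <= 11 and 12 <= y <= 13


# City locations from Fig. 1, kept exact with Fraction for 10.5 and 12.5.
CITIES = {
    "Antwerp": (10, 16),
    "Bastogne": (19, 6),
    "Bruges": (5, 16),
    "Brussels": (Fraction(21, 2), Fraction(25, 2)),
    "Charleroi": (10, 8),
    "Hasselt": (16, 14),
    "Liege": (17, 11),
}


def cities_in(region):
    """Names of all cities whose point satisfies the region's formula."""
    return sorted(name for name, (x, y) in CITIES.items() if region(x, y))
```

Unlike a real constraint database, this sketch can only test individual points; it does not manipulate the formulae themselves.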

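Example 2. Assuming that $S$ is a relation of type [0, 2], i.e., a semi-algebraic set in the plane, the FO + poly formula

$(\exists x_1)(\exists y_1)(\exists x_2)(\exists y_2)(\exists x_3)(\exists y_3)(\exists\lambda)(\exists\mu)(\exists\nu)(S(x_1, y_1) \wedge S(x_2, y_2) \wedge S(x_3, y_3) \wedge \lambda \geq 0 \wedge \mu \geq 0 \wedge \nu \geq 0 \wedge \lambda + \mu + \nu = 1 \wedge x = \lambda x_1 + \mu x_2 + \nu x_3 \wedge y = \lambda y_1 + \mu y_2 + \nu y_3)$

defines the convex-hull query of signature $[0, 2] \rightarrow [0, 2]$ which associates with $S$ its convex hull. □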
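
The formula of Example 2 says that a point belongs to the convex hull if it is a convex combination of three points of $S$; the products such as $\lambda x_1$ multiply two quantified variables, which is what makes the formula non-linear. The sketch below checks the same condition for a finite set of points with integer or `Fraction` coordinates. It is a hedged illustration, not a query evaluator for semi-algebraic sets, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def on_segment(p, a, b):
    """True if p lies on the closed segment from a to b."""
    return (cross(a, b, p) == 0
            and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def in_triangle(p, a, b, c):
    """True if p = lam*a + mu*b + nu*c with lam, mu, nu >= 0 and lam + mu + nu = 1."""
    det = cross(a, b, c)
    if det == 0:  # degenerate triangle: a, b, c are collinear
        return on_segment(p, a, b) or on_segment(p, b, c) or on_segment(p, a, c)
    lam = Fraction(cross(p, b, c), det)  # barycentric weight of a
    mu = Fraction(cross(p, c, a), det)   # barycentric weight of b
    nu = 1 - lam - mu                    # barycentric weight of c
    return lam >= 0 and mu >= 0 and nu >= 0


def in_convex_hull(points, p):
    """True if p is a convex combination of (at most) three of the given points."""
    return any(in_triangle(p, a, b, c)
               for a, b, c in combinations_with_replacement(points, 3))
```

Triples are drawn with replacement because the formula does not require the three points of $S$ to be distinct.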

Real  formulae  not  containing  non-linear  polynomials  are  called  linear  formulae. 
Point  sets  defined  by  linear  formulae  are  called  semi-linear  sets. 

The linear spatial data model is defined in the same way as the general spatial data model above, using linear formulae instead of real formulae. Similarly, linear queries can be defined. Notice that a general query induces a linear query if the query restricted to linear database instances is linear. Observe that the convex-hull query (Example 2) induces a linear query. Queries of signature $[m_1, n_1, \ldots, m_k, n_k] \rightarrow [0, 0]$ are called Boolean queries, because the sets $\{()\}$ and $\{\}$ can be seen as encoding the truth values true and false, respectively. Since both these sets are semi-linear, every Boolean query induces a linear query.

A very appealing linear query language for the linear spatial data model, called FO + linear, is obtained from FO + poly by only allowing linear formulae rather than real formulae.

Example 3. The following FO + linear formula defines a Boolean (and hence linear) query of signature $[0, 2] \rightarrow [0, 0]$ deciding whether $S$ is convex:

$(\forall x_1)(\forall y_1)(\forall x_2)(\forall y_2)(\forall x_3)(\forall y_3)(S(x_1, y_1) \wedge S(x_2, y_2) \wedge 2x_3 = x_1 + x_2 \wedge 2y_3 = y_1 + y_2 \Rightarrow S(x_3, y_3))$. □

We prove in Section 4, however, that not every linear query definable in FO + poly is definable in FO + linear. (In particular, we will show that the convex-hull query introduced in Example 2 is not definable in FO + linear.) Therefore, it makes sense to define FO + poly$^{lin}$ as the set of FO + poly-definable queries inducing linear queries. Thus, the set of queries definable in FO + poly$^{lin}$ is a strict subset of the set of queries definable in FO + poly.

Throughout the paper, we use vector notation to denote points. In this notation, formulae should be interpreted coordinate-wise. Hence, $\neg(\vec{x} = \vec{0})$ indicates that $\vec{x}$ is not the origin of the coordinate system, whereas $\vec{x} \neq \vec{0}$ denotes that none of the coordinates of $\vec{x}$ equals 0.

(Footnote to Example 2.) Let $A \subseteq \mathbb{R}^n$. The convex hull of $A$ is the smallest convex set of $\mathbb{R}^n$ containing $A$. In particular, the convex hull of a semi-linear set is a semi-linear set.

3 Expressiveness of FO + linear

In this section, we discuss the expressiveness of the query language FO + linear. To simplify the discussion, we focus on purely spatial queries, i.e., queries acting on linear databases consisting of relations of a type of the form $[0, n]$.

First, we briefly review some known results involving linear queries computable in FO + linear.

The following operations on semi-linear sets can be defined rather trivially in FO + linear: union, intersection, difference, complement, and projection. In general, any affine transformation of semi-linear sets can be defined in FO + linear. In [26], FO + linear expressions are given for the Boolean queries checking boundedness, convexity, and discreteness of semi-linear sets. The expressive power of FO + linear unfolds completely, however, when topological properties of geometrical objects are considered. The definitions of topological interior, boundary, and closure can indeed be translated almost straightforwardly into linear calculus expressions. Hence, for example, the regularization of a semi-linear set, defined as the closure of its interior, can be computed in FO + linear, which is of importance, since the regularized set operators union, intersection, and difference turn out to be indispensable in most spatial database applications [10, 19, 12]. More generally, Egenhofer et al. showed in a series of papers [7, 8, 9] that a whole class of topological relationships such as disjoint, in, contained, overlap, touch, equal, and covered can be defined in terms of intersections between the boundary, interior, and complement of the geometrical objects.

Another property of geometrical objects often used in spatial database applications is dimension. For instance, in [5], the dimension is used to further refine the class of topological relationships defined by Egenhofer et al. We now show that it can be decided in FO + linear whether a given semi-linear set has a given number as its dimension, which is the contribution of this section. Since there are only finitely many known possibilities for the dimension of a semi-linear set, it follows that the dimension can actually be computed in FO + linear.

Definition 1. The dimension of a semi-linear set $S$ of $\mathbb{R}^n$ is the maximum value of $d$ for which there exists an open d-dimensional cube fully contained in $S$. The dimension of the empty set is defined as −1.

Theorem 2. The predicate dim(S, d), in which $S$ is a semi-linear set of $\mathbb{R}^n$ and $d$ is a number, and which evaluates to true if the dimension of $S$ equals $d$, can be defined in FO + linear.

The correctness of this theorem follows from two lemmas we present next. We will use the notation $\pi_i(S)$, with $S$ a semi-linear set of $\mathbb{R}^n$, to denote the semi-linear set $\{(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) \mid (\exists x_i) S(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n)\}$ of $\mathbb{R}^{n-1}$. Hence, $\pi_i(S)$ is the projection of $S$ onto the i-th coordinate hyperplane of $\mathbb{R}^n$.

Lemma 3. The dimension of $\pi_i(S)$, with $S$ a d-dimensional semi-linear set of $\mathbb{R}^n$, is at most $d$ for $1 \leq i \leq n$.

Lemma 4. If $S$ is a d-dimensional semi-linear set of $\mathbb{R}^n$, with $d < n$, then there exists $i$, $1 \leq i \leq n$, such that the dimension of $\pi_i(S)$ equals $d$.

The rather technical proof of Lemma 4 is omitted due to space limitations.

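Now define empty(S) as the FO + linear formula $\neg(\exists \vec{x}) S(\vec{x})$, maxdim(S) as the FO + linear formula $(\exists \vec{x})(\exists \vec{\varepsilon})(\vec{\varepsilon} > \vec{0} \wedge (\forall \vec{y})(\vec{x} - \vec{\varepsilon} < \vec{y} < \vec{x} + \vec{\varepsilon} \Rightarrow S(\vec{y})))$, and $\max(d, d_1, \ldots, d_n)$ as the FO + linear formula (expression omitted) which evaluates to true if $d$ is the maximum of $d_1, \ldots, d_n$. Then the FO + linear formula $(d = -1 \wedge \mathrm{empty}(S)) \vee (d = 1 \wedge \mathrm{maxdim}(S)) \vee (d = 0 \wedge \neg\mathrm{empty}(S) \wedge \neg\mathrm{maxdim}(S))$ clearly defines dim(S, d) in $\mathbb{R}$. In general, the FO + linear formula $(d = n \wedge \mathrm{maxdim}(S)) \vee (\neg\mathrm{maxdim}(S) \wedge \dim(\pi_1(S), d_1) \wedge \ldots \wedge \dim(\pi_n(S), d_n) \wedge \max(d, d_1, \ldots, d_n))$, inductively defined, by Lemmas 3 and 4 defines dim(S, d) in $\mathbb{R}^n$.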
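
The one-dimensional base case of this construction can be mirrored directly in code. The sketch below assumes, for illustration only, that a semi-linear set of $\mathbb{R}$ is given as a finite list of closed intervals `(low, high)` with `low <= high`, where a single point has `low == high`; the function names follow the predicates above but are otherwise hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [low, high] with low <= high


def empty(s: List[Interval]) -> bool:
    """Counterpart of empty(S): the set contains no point."""
    return len(s) == 0


def maxdim(s: List[Interval]) -> bool:
    """Counterpart of maxdim(S): the set contains an open interval."""
    return any(low < high for low, high in s)


def dim(s: List[Interval]) -> int:
    """Dimension of a subset of R: -1 if empty, 1 if it has interior, else 0."""
    if empty(s):
        return -1
    if maxdim(s):
        return 1
    return 0
```

The three branches correspond to the three disjuncts of the formula for dim(S, d) in $\mathbb{R}$.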

Many  interesting  queries  can  be  defined  in  a  natural  way  using  the  dimension 
predicate,  and  are  therefore  also  definable  in  FO  +  linear,  as  is  illustrated  by  the 
following  example. 

Example 4. The Boolean query which decides whether a semi-linear set $S$ is a line or a line segment is definable in FO + linear, using the following expression:

$\dim(S, 1) \wedge (\forall \vec{x})(\forall \vec{y})(S(\vec{x}) \wedge S(\vec{y}) \Rightarrow S((\vec{x} + \vec{y})/2))$.

It should be noted that Afrati et al. [1] independently showed that the line query is definable in FO + linear. Their solution does not use the dimension predicate. □

A  precise  characterization  of  the  expressive  power  of  FO  +  linear  is  still  open, 
however. 

4  Limitations  of  FO  +  linear 

Recently, Afrati et al. [2] established that FO + linear cannot define all FO + poly-definable linear queries:

Propositions.  The  Boolean  query  on  semi-linear  sets  S  of  ^  which  evaluates 
to  true  if  there  exist  u  and  v  of  S  with  =  1,  is  not  definable  in  FO -I- linear. 

Even though the query in Proposition 5 involves a non-linear computation in order to evaluate it, it is nevertheless a linear query because it is Boolean, and therefore suffices to establish the incompleteness of FO + linear for the FO + poly-definable linear queries. The query, however, is unsatisfactory because it provides little insight into whether more natural, non-Boolean FO + poly-definable linear queries are FO + linear-definable. Two such queries are (1) the linear query from semi-linear sets of $\mathbb{R}^n$ to semi-linear sets of $\mathbb{R}^n$ computing the convex hull (discussed in Example 2 for n = 2); and (2) the Boolean query on semi-linear sets of $\mathbb{R}^n$ which evaluates to true if all points in the input are collinear. In this section, we show that the above queries are not definable in FO + linear if $n \geq 2$. To demonstrate this claim, we build on the following results established by Afrati et al. [2] and the present authors [26]:
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Proposition  6.  Let  n>2  and  m  >  3.  Then  the  following  sets  are  not  definable 
in  FO  +  linear; 

1.  {(ui,...,Um)  6  (R")™  I  Um  e  convex-hull({ui, . . . ,  and 

2.  {(ui, . . .  ,u^)  €  I  ui, . . .  ,u^  are  colinear}. 

Even though the undefinability in FO + linear of the sets defined in Proposition 6
may suggest that the related queries (1) and (2) mentioned earlier are also
non-definable in FO + linear, this deduction is not obvious. To see the caveat, it
suffices to notice that the sets defined in Proposition 6 are not even semi-linear,
whereas the related queries are obviously linear. This technical gap appears to
have been overlooked in both the work of Afrati et al. [2] and previous work of
the present authors [26]. In what follows, however, we show that there exists a
general technique to link results about the non-definability in FO + linear of sets
to the non-definability of certain related linear queries.

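Definition 7. Let $P$ be a semi-algebraic subset of $(\mathbf{R}^n)^m$, $m, n \geq 1$. Let $k$
be such that $0 \leq k \leq m$. Furthermore assume that $P$ and $k$ are such that,
for each $1 \leq l \leq k$, for each sequence $u_1, \ldots, u_l$ in $\mathbf{R}^n$, and for all
sequences $i_1, \ldots, i_k$ and $j_1, \ldots, j_k$ such that $1 \leq i_1, \ldots, i_k, j_1, \ldots, j_k \leq l$ and
$\{u_{i_1}, \ldots, u_{i_k}\} = \{u_{j_1}, \ldots, u_{j_k}\} = \{u_1, \ldots, u_l\}$, the following permutation
invariance property holds for all $u_{k+1}, \ldots, u_m$ in $\mathbf{R}^n$:

$(u_{i_1}, \ldots, u_{i_k}, u_{k+1}, \ldots, u_m) \in P \iff (u_{j_1}, \ldots, u_{j_k}, u_{k+1}, \ldots, u_m) \in P$.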

The query $Q_{P,k}$ of signature $[0, n] \to [0, n(m - k)]$ is now defined as follows.
If $S$ consists of at most $k$ points of $\mathbf{R}^n$, say $S = \{u_1, \ldots, u_k\}$ ($u_1, \ldots, u_k$ not
necessarily all distinct), then

$Q_{P,k}(S) = \{(u_{k+1}, \ldots, u_m) \mid (u_1, \ldots, u_k, u_{k+1}, \ldots, u_m) \in P\}$;

otherwise $Q_{P,k}(S)$ is empty.

Observe that the invariance property assumed for $P$ and $k$ guarantees that $Q_{P,k}$
is well-defined.

Example 5. 1. Let $P$ be the set

$\{(u_1, \ldots, u_m) \in (\mathbf{R}^n)^m \mid u_m \in \text{convex-hull}(\{u_1, \ldots, u_{m-1}\})\}$,

with $n \geq 2$ and $m \geq 3$. Let $k = m - 1$. Then $Q_{P,k}$ is the linear query that
associates with each set $S$ consisting of $m - 1$ points, the convex hull of $S$,
and with every other set $S$ the empty set. Notice that, by Proposition 6, the set
$P$ is not FO + linear-definable.

2. Let $P$ be the set

$\{(u_1, \ldots, u_m) \in (\mathbf{R}^n)^m \mid u_1, \ldots, u_m \text{ are colinear}\}$,

with $n \geq 2$ and $m \geq 3$. Let $k = m$. Then $Q_{P,k}$ can be interpreted as the
Boolean query which evaluates a semi-linear set $S$ to true if and only if $S$
consists of $m$ colinear points. Notice that, by Proposition 6, the set $P$ is not
FO + linear-definable. □

We must emphasize that the linear queries in Example 5 are closely related, but
not identical, to the linear queries (1) and (2) in the beginning of this section.
One can think of the queries in Example 5 as restrictions of the linear queries
(1) and (2) to certain finite sets.
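
To make Definition 7 and Example 5 concrete, the following sketch evaluates a query of the form $Q_{P,k}$ on finite inputs. It is only an illustration under stated assumptions: the names `colinear` and `make_query` are hypothetical, $P$ is represented as a membership predicate on $m$-tuples of points, the result $Q_{P,k}(S)$ is represented as a membership predicate on the remaining $m - k$ points, and `None` stands for the case in which the definition returns the empty set outright (including, by assumption, an empty input when $k > 0$).

```python
from itertools import combinations

def colinear(points):
    """True if all points (equal-length tuples of exact numbers) lie on one line."""
    if len(points) < 3:
        return True
    base = points[0]
    diffs = [tuple(c - b for c, b in zip(p, base)) for p in points[1:]]
    dim = len(base)
    for d1, d2 in combinations(diffs, 2):
        # d1 and d2 are parallel iff every 2x2 minor of the matrix [d1; d2] vanishes.
        if any(d1[i] * d2[j] - d1[j] * d2[i] != 0
               for i in range(dim) for j in range(i + 1, dim)):
            return False
    return True

def make_query(P, k):
    """Build Q_{P,k} from a membership predicate P on m-tuples of points (assumed representation)."""
    def query(S):
        pts = sorted(S)
        if len(pts) > k or (k > 0 and not pts):
            return None  # Q_{P,k}(S) is empty by definition (empty input is an assumption)
        # Enumerate S as u_1, ..., u_k by repeating a point; the permutation
        # invariance property makes the chosen enumeration irrelevant.
        seq = tuple(pts) + tuple(pts[:1]) * (k - len(pts))
        return lambda *rest: P(seq + tuple(rest))
    return query

# Second part of Example 5 with m = k = 3: a Boolean query.
is_colinear_triple = make_query(colinear, 3)
print(is_colinear_triple({(0, 0), (1, 1), (2, 2)})())
print(is_colinear_triple({(0, 0), (1, 0), (0, 1)})())
```

With $k = m$ the returned predicate takes no arguments, which matches the reading of $Q_{P,k}$ as a Boolean query.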

We now prove the following theorem:

Theorem 8. Let $P$ be a semi-algebraic subset of $(\mathbf{R}^n)^m$, $m, n \geq 1$, and let $P$
and $k$ satisfy the conditions of Definition 7. If $P$ is undefinable in FO + linear,
then the following holds:

1. The query $Q_{P,k}$ is undefinable in FO + linear.

2. If $Q$ is a linear query from semi-linear sets of $\mathbf{R}^n$ to semi-linear sets of
$(\mathbf{R}^n)^{m-k}$ such that, for every semi-linear set $S$ of $\mathbf{R}^n$, $Q(S) = Q_{P,k}(S)$ if
$Q_{P,k}(S)$ is not empty, then $Q$ is undefinable in FO + linear.

Proof. 1. Assume, to the contrary, that the query $Q_{P,k}$ is FO + linear-definable.
Then there exists an FO + linear formula $\varphi_{P,k}(R; x_{k+1}, \ldots, x_m)$, with $R$ an
appropriate predicate name, such that, for each semi-linear set $S$ of $\mathbf{R}^n$,
$Q_{P,k}(S) = \{(u_{k+1}, \ldots, u_m) \mid \varphi_{P,k}(S; u_{k+1}, \ldots, u_m)\}$. We now argue that the predicate
name $R$ must effectively occur in $\varphi_{P,k}$. If this were not the case, then the query
associated with $\varphi_{P,k}$ would be a constant function. This constant function cannot
yield the empty set, for, otherwise, by the definition of $Q_{P,k}$, $P$ would also
be the empty set, which is obviously FO + linear-definable, contrary to the
hypothesis of the theorem. The constant function cannot yield a non-empty set, either,
however, since again by the definition of $Q_{P,k}$, there is an infinite number of
inputs for which $Q_{P,k}$ returns the empty set. Thus $R$ must occur in $\varphi_{P,k}$.

Given the formula $\varphi_{P,k}$, we can construct the formula $\hat{\varphi}_{P,k}$ as follows. Let
$x_1, \ldots, x_k$ be variables that do not occur in $\varphi_{P,k}$. Now replace every literal of
the form $R(z)$ in $\varphi_{P,k}$ by the formula $z = x_1 \vee \cdots \vee z = x_k$. Observe that the
formula $\hat{\varphi}_{P,k}$ is a linear formula with free variables $x_1, \ldots, x_m$. Our claim is that
the formula $\hat{\varphi}_{P,k}$ defines the set $P$, a contradiction with the hypothesis of the
theorem. Consider an $m$-tuple $(u_1, \ldots, u_m) \in (\mathbf{R}^n)^m$. From the definition of $Q_{P,k}$
and $\varphi_{P,k}$, we have $(u_1, \ldots, u_m) \in P \iff (u_{k+1}, \ldots, u_m) \in Q_{P,k}(\{u_1, \ldots, u_k\})$,
whence $(u_1, \ldots, u_m) \in P \iff \varphi_{P,k}(\{u_1, \ldots, u_k\}; u_{k+1}, \ldots, u_m)$. It follows from
the construction of $\hat{\varphi}_{P,k}$ from $\varphi_{P,k}$ that $(u_1, \ldots, u_m) \in P \iff \hat{\varphi}_{P,k}(u_1, \ldots, u_m)$.

2. Assume that $Q$ is FO + linear-definable. Then there exists a formula
$\varphi_Q(R; x_{k+1}, \ldots, x_m)$ that defines $Q$. Given $\varphi_Q$, we can construct the formula $\hat{\varphi}_Q$:

$\hat{\varphi}_Q(R; x_{k+1}, \ldots, x_m) \equiv (|R| \leq k \wedge \varphi_Q(R; x_{k+1}, \ldots, x_m)) \vee (|R| > k \wedge \mathrm{false})$.

It is obvious that this expression for $\hat{\varphi}_Q$ can be translated into proper FO + linear
syntax. It now follows from the properties of $Q$ that the formula $\hat{\varphi}_Q$ defines the
query $Q_{P,k}$. Hence, it would follow that $Q_{P,k}$ is FO + linear-definable, which is
impossible by the first part of the theorem. □

Theorem 8 has the following corollary:

Corollary  9.  The  convex  hull  query  (1)  and  the  colinearity  query  (2)  are  not 
definable  in  FO  +  linear. 
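
The substitution step in the first part of the proof of Theorem 8, in which every literal $R(z)$ is replaced by $z = x_1 \vee \cdots \vee z = x_k$, can be illustrated on a toy formula representation. The sketch below is an assumption-laden illustration, not the authors' construction: formulas are nested tuples, and the tags `"R"`, `"eq"`, `"and"`, `"or"`, `"not"`, `"exists"`, `"forall"` as well as the function name `substitute_R` are hypothetical.

```python
def substitute_R(formula, k):
    """Replace every atom ("R", z) by the disjunction z = x1 or ... or z = xk."""
    tag = formula[0]
    if tag == "R":
        z = formula[1]
        return ("or",) + tuple(("eq", z, f"x{i}") for i in range(1, k + 1))
    if tag in ("and", "or", "not"):
        # Boolean connectives: rewrite each subformula.
        return (tag,) + tuple(substitute_R(sub, k) for sub in formula[1:])
    if tag in ("exists", "forall"):
        # Quantifiers: keep the bound variable, rewrite the body.
        return (tag, formula[1], substitute_R(formula[2], k))
    # Any other atom (for example a linear constraint) is left untouched.
    return formula

phi = ("exists", "z", ("and", ("R", "z"), ("lt", "z", "y")))
print(substitute_R(phi, 2))
```

The rewritten formula no longer mentions the predicate name; its free variables now include the fresh variables standing for the points of the input set.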

5  Extensions  of  FO  +  linear 

Although, in Section 3, it is shown that a wide range of useful, complex linear
queries can be defined in FO + linear, the language lacks the expressive power
to define some important FO + poly$^{lin}$ queries, as is clearly demonstrated in
Section 4. Hence the search for languages that capture such queries is important.
Without such languages, we would indeed be hard-pressed to substantiate the
claim that the linear model is to be adopted as the fundamental model for
applications involving linear geometric objects.

The obvious way to obtain a query language which is complete for the
FO + poly$^{lin}$ queries is to discover an algorithm that can decide which FO + poly
formulae induce linear queries. Unfortunately, such an algorithm does not exist:

Theorem  10.  It  is  undecidable  whether  an  arbitrary  FO  +  poly  formula  induces 
a  linear  query. 

Proof. (Sketch.) The proof is a variation of a proof by Paredaens et al. [22]
concerning undecidability of genericity in FO + poly (Theorem 1, p. 285). The
$\forall^*$-fragment of number theory is undecidable since Hilbert’s 10th problem can be
reduced to it. Encode a natural number $n$ by the one-dimensional semi-algebraic
set $\mathrm{enc}(n) := \{0, \ldots, n\}$, and encode a vector of natural numbers $(n_1, \ldots, n_k)$ by

$\mathrm{enc}(n_1) \cup (\mathrm{enc}(n_2) + n_1 + 2) \cup \ldots \cup (\mathrm{enc}(n_k) + n_1 + 2 + \cdots + n_{k-1} + 2)$.

The corresponding decoding is first-order. We then reduce a $\forall^*$-sentence $(\forall x)\varphi(x)$ of
number theory to the following query of signature $[0, 1] \to [0, 0]$:

if $R$ encodes a vector $x$ then if $\varphi(x)$ then $\emptyset$ else {(^,1;) = 1} else $\emptyset$.

This query is definable in FO + poly and induces a linear query if and only if
the $\forall^*$-sentence is valid. □
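
The encoding of vectors of natural numbers used in the proof sketch of Theorem 10 can be tried out on small inputs. The following is a minimal sketch, assuming finite sets of integers as the representation; the function names `enc`, `encode_vector`, and `decode_vector` are chosen for the illustration, and the decoder does not check that its input is a well-formed encoding.

```python
def enc(n):
    """enc(n) = {0, ..., n}."""
    return set(range(n + 1))

def encode_vector(vec):
    """Union of shifted copies of enc(n_i); consecutive blocks are separated by a gap."""
    points, offset = set(), 0
    for n in vec:
        points |= {x + offset for x in enc(n)}
        offset += n + 2  # shift the next block past the current one, leaving one unused integer
    return points

def decode_vector(points):
    """Read the vector back from the maximal runs of consecutive integers."""
    vec, run_start, prev = [], None, None
    for x in sorted(points):
        if prev is None or x != prev + 1:
            if prev is not None:
                vec.append(prev - run_start)  # a run of length n + 1 encodes n
            run_start = x
        prev = x
    if prev is not None:
        vec.append(prev - run_start)
    return vec

print(sorted(encode_vector([2, 0, 3])))
print(decode_vector(encode_vector([2, 0, 3])))
```

Each component $n_i$ becomes a run of $n_i + 1$ consecutive integers, and the extra shift of 2 leaves exactly one unused integer between runs, which is what makes the components recoverable.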

Theorem 10 shows that a top-down approach to discover a useful linear sub-query
language is difficult. Observe that Theorem 10 still allows the isolation of a subset
of the FO + poly formulae that define FO + poly$^{lin}$ queries, in the same way that
the undecidability of safeness in the relational calculus is not in contradiction
with the existence of a sub-language of the relational calculus which has precisely
the expressive power of the safe relational calculus queries [25].

In this section, we therefore take a bottom-up approach to discover
restrictions of FO + poly$^{lin}$ that are strictly more expressive than FO + linear. The
basic idea is to extend FO + linear with certain linear operators, such as the
colinearity or the convex-hull query. It is important to observe in this respect
how careful we have to be to avoid creating languages that are no longer linear.
A too liberal syntax can indeed lead to the definability of non-semi-linear sets
associated to these operators, such as the sets exhibited in Proposition 6. This
in turn can have as a consequence that the language obtains the full expressive
power of FO + poly, as is shown by the following example [26].

Example 6. Extending FO + linear with the convex-hull predicate (or with the
colinearity predicate, which can be derived from the former) as defined in [26]
leads to a language with the expressive power of FO + poly [26], the reason being
that the predicate $\mathrm{product}(x, y, z)$ defined by $z = xy$ can be expressed as^

$\neg(\exists! \mathbf{u})(\mathrm{colinear}(\mathbf{x}, \mathbf{e}_2, \mathbf{u}) \wedge \mathrm{colinear}(\mathbf{y}, \mathbf{z}, \mathbf{u}))$,

where $\mathbf{x} = (x, 0)$, $\mathbf{y} = (0, y)$, $\mathbf{z} = (z, 0)$, $\mathbf{u} = (u_1, u_2)$, and $\mathbf{e}_2 = (0, 1)$. □
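
The reason the expression in Example 6 captures multiplication can be checked numerically. Each colinearity atom is a linear equation in $(u_1, u_2)$, and a system of two such equations fails to have exactly one solution precisely when its determinant vanishes; here that determinant works out to $xy - z$. The sketch below is an illustration under these assumptions, with hypothetical function names, and uses exact rational arithmetic.

```python
from fractions import Fraction

def line_normal(a, b):
    """Coefficients (c1, c2) of u1 and u2 in the linear equation 'u is colinear with a and b'."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return (-dy, dx)

def product_via_colinearity(x, y, z):
    """True iff there is NOT exactly one u colinear with both pairs of points."""
    n1 = line_normal((x, 0), (0, 1))  # colinear(x, e2, u): normal (-1, -x)
    n2 = line_normal((0, y), (z, 0))  # colinear(y, z, u): normal (y, z)
    det = n1[0] * n2[1] - n1[1] * n2[0]  # equals x*y - z
    return det == 0

print(product_via_colinearity(2, 3, 6))
print(product_via_colinearity(2, 3, 7))
print(product_via_colinearity(Fraction(1, 2), 4, 2))
```

When $y = z = 0$ the second atom holds for every $\mathbf{u}$, so the solution is again not unique, in agreement with $z = xy = 0$.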

We now proceed with showing how FO + linear can be extended with operators
in a safe way. The subtle point of our definition consists in disallowing free real
variables in set terms. So, even though a set term might have free value variables,
it is disallowed to have free real variables.

An operator is defined to be an FO + poly$^{lin}$ query. The signature of an
operator is the signature of that query.

Let $\mathcal{O}$ be a set of operator names $O$ typed with a signature, each of which
represents an operator $\mathrm{op}(O)$ of the same signature.

The query language FO + linear + $\mathcal{O}$ is then defined as an extension of FO +
linear, as follows. First, we extend the terms of FO + linear with set terms:

- If $\varphi$ is an FO + linear + $\mathcal{O}$ formula with $n$ free real variables $x_1, \ldots, x_n$ and
$m$ free value variables $v_1, \ldots, v_m$, and if $k \leq m$, then

$\{(v_1, \ldots, v_k, x_1, \ldots, x_n) \mid \varphi(v_1, \ldots, v_m, x_1, \ldots, x_n)\}$

is a set term of type $[k, n]$. Observe that of the value variables, $v_{k+1}, \ldots, v_m$
occur free, while all real variables, $x_1, \ldots, x_n$, occur bound in the set term.^

- If $O$ is an operator name in $\mathcal{O}$ of type $[m_1, n_1, \ldots, m_k, n_k] \to [m, n]$, and
$S_1, \ldots, S_k$ are set terms of types $[m_1, n_1], \ldots, [m_k, n_k]$, respectively, then

$O(S_1, \ldots, S_k)$

is a set term of type $[m, n]$ with as free variables those in the union of all
free variables in $S_1$ through $S_k$ (which are all value variables).

Finally, we extend the atomic formulae of FO + linear:

- Let $S$ be a set term of type $[m, n]$. Then $S(v_1, \ldots, v_m, x_1, \ldots, x_n)$, with
$v_1, \ldots, v_m$ value variables and $x_1, \ldots, x_n$ real variables, is an atomic formula
with free variables $v_1, \ldots, v_m, x_1, \ldots, x_n$ union the free (value) variables of
$S$.

^ The quantifier “$\exists!$” should be read as “there exists exactly one” and can be expressed
in FO in a straightforward manner.

® Observe that this definition allows us to interpret a predicate name $R$ of type $[k, n]$
as a set term of type $[k, n]$.

When actual values are substituted for the free variables, a set term of type $[m, n]$
represents a set of tuples, each consisting of $m$ values followed by a point of $\mathbf{R}^n$.
Consider then an atomic formula of the form
$S(v_1, \ldots, v_m, x_1, \ldots, x_n)$; this atomic formula evaluates to true if the evaluation
of $(v_1, \ldots, v_m, x_1, \ldots, x_n)$ belongs to the set represented by the set term $S$. The
full semantics of FO + linear + $\mathcal{O}$ is now straightforward to define.

If we constrain the operators in $\mathcal{O}$ to be FO + linear-definable, we can prove
the following safety property by induction on the structure of FO + linear + $\mathcal{O}$
formulae:

Theorem 11. The query language FO + linear + $\mathcal{O}$ only expresses FO + poly$^{lin}$-definable
queries.

The syntactic restriction that set terms contain only free value variables is
essential for Theorem 11 to hold; otherwise, e.g., the formula in Example 6 could
be expressed in FO + linear + colinear, whence FO + linear + colinear would have
the full expressive power of FO + poly.

Without going into details, we mention that it is possible to define an
algebraic query language equivalent to FO + linear + $\mathcal{O}$ by extending the linear
algebra [26] with the operators represented by $\mathcal{O}$. This equivalence result forms a
theoretical justification for the approach Güting has taken with the development
of the ROSE algebra [13, 14, 15, 16], which extends the relational algebra
with a class of spatial operators.

Finally, we give an example of an FO + linear + $\mathcal{O}$ query language in which we
can express the queries (1) and (2) in the beginning of Section 4. Thereto, define
an infinite set of operator names $\mathrm{seg}^k$ of signature $[0, k] \to [0, k]$ and associate
with each operator name $\mathrm{seg}^k$ the operator $\mathrm{op}(\mathrm{seg}^k)$ defined by
$\mathrm{op}(\mathrm{seg}^k)(S) = \{x \in \mathbf{R}^k \mid (\exists y)(\exists z)(S(y) \wedge S(z) \wedge x \in [y, z])\}$ for each semi-linear set $S$ of $\mathbf{R}^k$.
Let $\mathcal{S}$ be the set of all $\mathrm{seg}^k$, $k > 0$. Now let $R$ be a predicate representing a
semi-linear set of $\mathbf{R}^k$. The FO + linear + $\mathcal{S}$ formula

$\underbrace{\mathrm{seg}^k(\mathrm{seg}^k(\ldots \mathrm{seg}^k}_{k \text{ times}}(R)))(x)$

computes the convex hull of the set represented by $R$. Using the convex-hull
query as a macro, we can express colinearity by the following FO + linear + $\mathcal{S}$
formula:

$(\exists d)(\mathrm{dim}(\{x \mid \text{convex-hull}(S)(x)\}, d) \wedge d \leq 1)$.

6  Conclusion 

In this paper we studied languages that define FO + poly$^{lin}$ queries. Amongst
these languages, the most natural one is FO + linear. For this language, we
showed that non-trivial FO + poly$^{lin}$ queries, such as the dimension query, can
be defined in it, but we also demonstrated that important FO + poly$^{lin}$ queries,
such as the convex hull, cannot be defined. These latter results led us to the
introduction of extensions of FO + linear with FO + poly$^{lin}$-definable operators.
The crucial part of this construction was requiring that operators can only be
applied to set terms without free real variables. As we showed with a
counterexample, our construction can lead to unsafe query languages if that restriction
is lifted.

We conclude by mentioning the two most prominent open problems raised
by this paper: (i) does there exist a syntactic restriction on FO + poly formulae
that yields a sublanguage of FO + poly which is sound and complete for the
FO + poly$^{lin}$ queries; and (ii) does there exist an extension of FO +
linear (or other sublanguages of FO + poly) with operators that yields soundness
and completeness?

Analysis of Heuristic Methods for Partial
Constraint Satisfaction Problems*

Abstract. Problems that do not have complete solutions occur in many
areas of application of constraint solving. Heuristic repair methods that
have been used successfully on complete CSPs can also be used on
overconstrained problems. A difficulty in analyzing their performance is the
uncertainty about the goodness of solutions returned in relation to the
optimal (best possible) solutions. This difficulty can be overcome by
testing these procedures on problems that can be solved by complete
methods, which return certifiably optimal solutions. With this experimental
strategy, comparative analyses of hill-climbing methods were carried out
using anytime curves that could be compared with known optima. In
addition, extensive analysis of parameter values for key strategies such
as random walk and restarting could be done precisely and efficiently
by allowing local search to run until a solution was discovered that was
known to be optimal, based on earlier tests with complete methods. An
important finding is that a version of min-conflicts that incorporates the
random walk strategy, with a good value for the walk probability, appears
to be as efficient in this domain as several of the more elaborate methods
for improving local search that have been proposed in recent years.


1  Introduction 

Constraint satisfaction problems (CSPs) involve finding an assignment of values
to variables that satisfies a set of constraints between these variables. In many
important applications the problems may be overconstrained, so that no
complete solution is possible. In these cases, ‘partial’ solutions (i.e., assignments that
do not satisfy all constraints in the problem) may still be useful if a sufficient
number of the most important constraints are satisfied.

An important class of partial constraint satisfaction problems (PCSPs) is
the maximal constraint satisfaction problem (MAX-CSP), in which the goal is
to find assignments of values to variables that satisfy the maximum number
of constraints. Since this problem involves assigning (equal) penalties for each
constraint, methods for solving it can be readily generalized to accommodate
constraint preferences or weighted constraints.

* This material is based on work supported by the National Science Foundation under
Grant Nos. IRI-9207633 and IRI-9504316. Some of this material was presented at
the Workshop on Overconstrained Systems at CP95.

In recent years methods have been developed for solving CSPs which are
based on local improvements of an initial assignment that violates an
unspecified number of constraints, rather than on incremental extensions of a fully
consistent partial solution. These procedures are hill-climbing methods that search
a space of solutions, following local gradients based on the number of violated
constraints. In some cases these local or heuristic repair methods have solved
problems that are much larger than those that can be solved by complete
methods that involve backtrack search [8, 11].

In  view  of  these  results,  it  seems  likely  that  heuristic  repair  techniques  can 
be  applied  successfully  to  overconstrained  problems,  including  MAX-CSPs.  Since 
these  are  optimization  problems,  the  basic  question  is  whether  such  local  search 
methods  will  return  optimal  (here,  maximal)  or  near-optimal  solutions  to  these 
problems  after  a  reasonable  amount  of  time.  One  reason  for  expecting  them  to 
do  well  is  that  hill-climbing  methods  do  not  embody  the  concept  of  a  complete, 
or  fully  consistent,  solution;  instead  they  rely  on  notions  of  better  or  worse 
in  comparing  different  assignments  of  values  to  variables.  Since  this  concept 
of  relative  goodness  does  not  change  when  PCSPs  are  tested  rather  than  CSPs, 
methods  that  rely  on  it  should  not  be  hampered  in  this  new  domain.  In  contrast, 
the  different  notions  of  goodness  used  by  complete  CSP  algorithms  (based  on 
the  Boolean  AND  function)  and  by  algorithms  for  PCSPs  such  as  branch  and 
bound  (based  on  additive  penalty  functions)  make  the  latter  much  harder  to 
solve  with  these  methods  (cf.  [10]). 

To  evaluate  local  search  methods,  one  must  be  able  to  assess  the  quality  of 
solutions  that  they  find.  For  problems  with  complete  solutions  this  evaluation 
is  easy,  since  an  optimal  solution  is  one  that  is  complete,  i.e.,  it  must  satisfy 
all  the  constraints  in  the  problem.  Quality  can  then  be  judged  in  terms  of  the 
difference  between  the  solution  found  and  a  complete  solution.  In  this  case,  the 
number  of  violated  constraints  or  the  sum  of  the  weights  of  these  constraints  is 
a  straightforward  measure  of  quality. 

In contrast, with overconstrained problems one cannot determine
optimality a priori. However, rigorous assessment is possible if complete methods can
be used that return guaranteed optimal solutions. Solutions found by heuristic
repair methods can then be compared with those found by complete methods,
and differences in quality can be assessed in the same way as for problems with
complete solutions.

Unfortunately,  the  size  of  problems  for  which  this  is  possible  is  restricted 
because  of  limits  on  the  capacity  of  complete  methods  to  solve  large  problems. 
However,  complete  methods  are  now  available  for  solving  all  problems  in  some 
classes  of  MAX-CSPs  with  30-60  variables.  This  allows  unbiased  sampling  of 
problems  that  are  large  enough  to  be  interesting  in  some  applications,  and  may 
also  allow  some  assessment  of  trends  with  increasing  problem  size  (as  well  as 
other  parameters  such  as  density  and  constraint  tightness).  For  larger  problems 
of,  say,  100  variables,  it  may  be  possible  to  solve  a  portion  of  the  problems 
with  these  methods.  But  in  this  case  the  analysis  is  hampered  by  the  possible 
introduction of bias, since search methods may perform differently on easy and
hard problems and here only the former can be properly evaluated.

Thus,  in  addition  to  its  primary  concern  with  effectiveness  of  heuristic  repair 
techniques,  this  paper  is  also  a  study  of  how  to  evaluate  heuristic  methods  when 
problems  do  not  have  complete  solutions. 

The  next  section  gives  some  background  pertaining  to  CSPs  and  describes  the 
heuristic  repair  methods.  Section  3  outlines  the  basic  experimental  methodology, 
and  Section  4  gives  experimental  results  with  heuristic  methods  for  MAX-CSPs. 
Section  5  gives  conclusions. 

2  Algorithms 

A  constraint  satisfaction  problem  (CSP)  involves  assigning  values  to  variables 
that  satisfy  a  set  of  constraints  among  subsets  of  these  variables.  The  set  of  values 
that  can  be  assigned  to  one  variable  is  called  the  domain  of  that  variable.  In  the 
present  work  all  constraints  are  binary,  i.e.,  they  are  based  on  the  Cartesian 
product  of  the  domains  of  two  variables.  A  binary  CSP  is  associated  with  a 
constraint  graph,  where  nodes  represent  variables  and  arcs  represent  constraints. 
If  two  values  assigned  to  variables  that  share  a  constraint  are  not  among  the 
acceptable  value-pairs  of  that  constraint,  this  is  an  inconsistency  or  constraint 
violation. 

For  MAX-CSPs,  the  number  of  constraint  violations  in  an  assignment,  termed 
the  distance  of  a  solution,  is  used  as  a  measure  of  quality.  Thus,  better  solutions 
are  those  with  lower  distances,  and  solutions  with  the  lowest  possible  distance, 
within  the  set  of  possible  assignments,  are  optimal  solutions. 

Heuristic  repair  procedures  for  CSPs  begin  with  a  complete  assignment  and 
try  to  improve  it  by  choosing  alternative  assignments  that  reduce  the  number 
of  constraint  violations.  An  important  example  is  the  min-conflicts  procedure, 
which  was  the  first  hill-climbing  method  to  be  tested  on  CSPs  [7].  This  procedure 
has  two  phases.  The  first  is  a  greedy  preprocessing  step,  in  which  assignments 
are  made  to  successive  variables  so  as  to  minimize  the  number  of  constraint 
violations  with  values  already  chosen.  This  is  followed  by  a  hill-climbing  phase, 
in  which,  at  each  step,  a  variable  is  chosen  whose  assignment  conflicts  with  one 
or  more  assignments,  and  a  value  is  chosen  for  that  variable  that  minimizes 
the  number  of  conflicts.  Normally  both  variable  and  value  selection  involve  an 
element  of  randomness:  variables  are  chosen  at  random  from  all  those  that  have 
conflicts,  and  values  are  chosen  at  random  from  the  set  of  min-conflict  values  for 
the  selected  variable. 

Limitations of the basic min-conflicts procedure for many kinds of CSPs,
including random CSPs and coloring problems, have often been demonstrated (e.g.,
[14]). Here it will be shown that its major drawbacks, which are also found with
MAX-CSPs, can be overcome by adding well-known strategies for escaping local
minima. These strategies are: (i) a retry strategy, in which the procedure starts
again with a new assignment after a certain number of changes in the original
one, and (ii) a random walk strategy, in which, after choosing a variable with a
conflicting value as before, a new value is chosen at random for assignment, with
probability p, while the usual min-conflicts procedure is followed with
probability 1 - p. In the initial studies (Section 4.1), values for retry and walk are based
on published sources; the number of assignments before restarting is five times
the number of variables [3], while the probability of a random choice in the walk
procedure is always 0.35 [11]. Since these values were originally used for SAT
problems, there is no reason to think that they are optimal for CSPs. Hence, later
tests include systematic variation of the values for assignments before restarting
and for the walk probability.
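
The min-conflicts procedure with the random walk strategy described above can be sketched in a few lines. This is an illustration only, not the implementation used in the experiments: the problem representation (a dictionary mapping a pair of variables `(i, j)` to its set of allowed value pairs), all function names, the step limit, and the seeding are assumptions, and the retry strategy is left out for brevity.

```python
import random

def violated(constraints, assign):
    """List the constraints (i, j) violated by a full assignment."""
    return [(i, j) for (i, j), allowed in constraints.items()
            if (assign[i], assign[j]) not in allowed]

def conflicts(constraints, assign, var, val):
    """Count constraints on `var` violated if it took `val` (unassigned neighbours are ignored)."""
    count = 0
    for (i, j), allowed in constraints.items():
        if i == var and assign[j] is not None:
            count += (val, assign[j]) not in allowed
        elif j == var and assign[i] is not None:
            count += (assign[i], val) not in allowed
    return count

def min_conflict_value(constraints, domains, assign, var, rng):
    """Pick at random among the values of `var` with the fewest conflicts."""
    scores = {v: conflicts(constraints, assign, var, v) for v in domains[var]}
    best = min(scores.values())
    return rng.choice([v for v, s in scores.items() if s == best])

def greedy_init(constraints, domains, rng):
    """Greedy preprocessing: assign variables in order, minimizing conflicts with earlier ones."""
    assign = [None] * len(domains)
    for var in range(len(domains)):
        assign[var] = min_conflict_value(constraints, domains, assign, var, rng)
    return assign

def min_conflicts_walk(constraints, domains, p=0.35, max_steps=1000, seed=0):
    """Return (best assignment found, its distance), the distance being the number of violations."""
    rng = random.Random(seed)
    assign = greedy_init(constraints, domains, rng)
    best, best_dist = list(assign), len(violated(constraints, assign))
    for _ in range(max_steps):
        bad = violated(constraints, assign)
        if not bad:
            break  # distance 0: nothing left to repair
        var = rng.choice(sorted({v for c in bad for v in c}))  # a variable in conflict
        if rng.random() < p:
            assign[var] = rng.choice(domains[var])  # random walk step
        else:
            assign[var] = min_conflict_value(constraints, domains, assign, var, rng)
        dist = len(violated(constraints, assign))
        if dist < best_dist:
            best, best_dist = list(assign), dist
    return best, best_dist

# Overconstrained toy problem: three mutually different variables, two values each.
neq2 = {(a, b) for a in range(2) for b in range(2) if a != b}
triangle = {(0, 1): neq2, (0, 2): neq2, (1, 2): neq2}
print(min_conflicts_walk(triangle, [[0, 1]] * 3))
```

Because the best assignment seen so far is stored separately, random walk steps that temporarily increase the number of violations do not degrade the returned solution.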

The best versions of min-conflicts are also compared with other strategies
originally proposed for complete CSPs: (i) GSAT, in which the next
reassignment is one that yields a maximal improvement in the solution (i.e., a maximal
decrease in the number of constraint violations), and which also incorporates a
retry strategy [6]; (ii) breakout, in which constraint violations associated with a
current local minimum are penalized by weighting them more strongly, so that
alternative assignments are then preferred [9]; (iii) EFLOP, in which, on
encountering a local minimum, one variable in conflict is given another assignment at
random and then adjacent variables are given assignments that reestablish
consistency with those just reassigned (the present version also uses the “EFLOP
heuristic”, in which variables are chosen to minimize the number of new
violations) [14]; (iv) weak commitment search, which starts with an initial full
assignment and then tries to extend a partial set of variables that is completely
consistent, and which restarts this process whenever a variable is found that
cannot be added without engendering an inconsistency [15]. In the original version,
discarded partial solutions were added to the problem as nogoods; however, it
was found here that the proliferation of nogoods slowed down processing
considerably even with an efficient storage and lookup mechanism. Hence, this feature
is not used in the present version, without apparent decline in effectiveness. In
addition, all procedures begin with an assignment generated by the greedy
preprocessing procedure used with min-conflicts in order to facilitate comparisons.
Some limited observations have also been made with simulated annealing [5] and
a version of min-conflicts that incorporates tabu search procedures [2].

In  the  present  work  branch  and  bound  versions  of  CSP  algorithms  are  used 
to  determine  the  optimal  number  of  constraint  violations  in  the  problems.  These 
algorithms  are  described  in  detail  in  [1,  13]. 

3  Experimental  Methods 

Random  CSPs  were  generated  using  a  “probability  of  inclusion”  (PI)  model  of 
generation  (cf.  [1]).  In  the  present  case  the  number  of  variables  was  fixed,  as 
well  as  domain  size.  Each  possible  constraint  and  constraint  value  pair  was  then 
chosen  with  a  specified  probability.  These  problems  had  either  30  or  100  vari¬ 
ables,  and  domains  were  fixed  at  either  5  or  10.  All  30- variable  problems  could 
be  run  to  completion  with  the  branch  and  bound  methods  used,  so  that  quality 
of  solutions  returned  by  heuristic  methods  could  be  evaluated  by  comparisons 
with  known  optimal  distances.  Further  details  on  parameter  values  for  specific 
problem  sets  are  given  in  Section  4. 

In addition, 'geometric' CSPs were generated, using the procedure described
in [5]. Briefly, variables are given random coordinates within the unit square;
when coordinate points are within some specified distance of each other, a
constraint is added between the associated variables. In contrast to
homogeneous random problems, these problems exhibit more clustering of
constraints in the constraint graph, a feature that can increase the
difficulty for hill-climbing methods [6].
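A minimal sketch of the constraint-graph step for geometric problems, assuming uniformly random coordinates and a Euclidean distance threshold. The name `geometric_constraint_graph` and the `radius` parameter are illustrative, and the details of the procedure in [5] may differ.

```python
import itertools
import math
import random


def geometric_constraint_graph(n_vars, radius, seed=None):
    """Place variables in the unit square and connect nearby ones.

    Returns the list of points and the list of constrained pairs (i, j),
    i < j, whose points lie within `radius` of each other.
    """
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_vars)]
    edges = [
        (i, j)
        for i, j in itertools.combinations(range(n_vars), 2)
        if math.dist(points[i], points[j]) <= radius
    ]
    return points, edges


points, edges = geometric_constraint_graph(n_vars=30, radius=0.2, seed=0)
print("realized density:", len(edges) / (30 * 29 // 2))
```

Nearby points tend to be mutually close, which is what produces the clustering of constraints mentioned above.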

For  all  classes  of  problems,  the  sample  size  was  25. 

There  were  two  types  of  experiment.  In  the  first,  each  heuristic  method  in 
that  experiment  is  run  for  a  fixed  period.  Then,  for  a  set  of  specified  times  the 
number  of  violations  in  the  best  solution  found  so  far  is  averaged  across  the 
sample  of  problems.  The  resulting  average  distances  are  displayed  as  “anytime 
curves”,  that  show  the  quality  of  solution  obtained  after  successively  longer 
durations  of  processing.  In  the  second  type  of  experiment,  heuristic  methods  are 
run  with  a  sufficient  bound  on  the  distance  that  is  equal  to  the  optimal  distance, 
determined  earlier  with  complete  methods.  This  allows  runs  to  be  terminated  as 
soon  as  an  optimal  solution  is  found;  at  this  point  the  next  run  begins.  In  these 
experiments,  the  specified  cutoff  time  is  essentially  set  to  ‘infinity’,  so  that  all 
runs  continue  until  an  optimal  solution  is  found. 

Anytime  curves  reported  in  Section  4.1  are  based  on  the  conflicts  (constraint 
violations)  in  the  initial  solution,  as  well  as  the  number  of  conflicts  in  the  best 
solution  found  after  0.05,  0.1,  0.5,  1,  10,  50  and  100  seconds.  In  the  present 
experiments,  these  curves  are  based  on  means  of  five  runs  of  100  seconds  with 
each  of  the  25  problems  in  a  sample.  Means  reported  in  Section  4.2  for  the  time 
to  find  an  optimal  solution  are  based  on  ten  runs  per  problem.  (In  preliminary 
tests,  standard  deviations  for  ten  means  of  single  runs  on  the  25  problems  in  a 
sample  were  found  to  be  about  5%  of  the  grand  means  of  the  runs  for  each  time 
tested.  This  indicates  that  for  samples  of  this  size  the  mean  is  quite  stable,  even 
for  single  runs.) 
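The averaging behind an anytime curve can be sketched as below. The trace format (a time-ordered list of `(elapsed_seconds, violations)` records starting with the initial solution at time 0) and the function names are assumptions made for illustration.

```python
def best_so_far(trace, checkpoint):
    """Fewest violations recorded in `trace` up to and including `checkpoint`.

    `trace` is a time-ordered list of (elapsed_seconds, violations) pairs
    whose first entry is the initial solution at time 0.
    """
    best = trace[0][1]
    for elapsed, violations in trace:
        if elapsed > checkpoint:
            break
        best = min(best, violations)
    return best


def anytime_curve(traces, checkpoints):
    """Average best-so-far violation count over all runs at each checkpoint."""
    return [
        sum(best_so_far(trace, c) for trace in traces) / len(traces)
        for c in checkpoints
    ]


# Two made-up runs, evaluated at the checkpoints used in Section 4.1.
traces = [
    [(0.0, 10), (0.04, 6), (0.7, 3)],
    [(0.0, 8), (0.2, 4), (20.0, 2)],
]
checkpoints = [0.05, 0.1, 0.5, 1, 10, 50, 100]
print(anytime_curve(traces, checkpoints))
```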

All  procedures  were  coded  in  Common  Lisp  using  Lispworks  by  Harlequin. 
Testing  was  done  on  a  DEC  Alpha  (DEC3000  M300LX). 

4  Experimental  Results 

4.1  Behavior  of  Min-Conflicts  Procedures 

In the first experiments, four sets of 30-variable problems were used with
expected densities of either 0.10 or 0.50 and average optimal distances close
to 2 or to 8.5. (These distances were obtained at each expected density by
varying the expected tightness of constraints; cf. Figures 1-2 and Section 4.2.)
Figure 1 shows results based on one problem set. This figure shows anytime
curves for an efficient branch and bound algorithm based on forward checking,
with variable ordering by decreasing number of constraints (the degree of the
variable's node in the constraint graph), and for min-conflicts hill-climbing.
The latter finds markedly better solutions early in search. Since it normally
begins with a greedy preprocessing step, the number of conflicts in the initial
assignment is much less than the number of conflicts in the first solution
found by branch and bound. For this reason, a version of min-conflicts that
starts with a random assignment is also included for comparison. In this case
the initial number of conflicts is almost equal to that for branch and bound.
Nonetheless, the anytime curve for hill-climbing descends much more rapidly
than the one for branch and bound, and after the first second of runtime, it
finds solutions that are as good as those found by the basic version of
min-conflicts.

Fig. 1. Averaged anytime curves for forward checking branch and bound and
min-conflicts (with and without greedy preprocessing). 30-variable problems,
exp. density = 0.10, exp. tightness = 0.46, mean optimal distance = 2.08. Note
the log scale on the abscissa.


Both  versions  of  min-conflicts  are,  therefore,  more  efficient  in  finding  good 
partial  solutions  than  branch  and  bound  at  short  intervals.  However,  as  in  the 
case  of  complete  CSPs,  min-conflicts  quickly  becomes  ‘stuck’;  within  five  seconds 
it  reaches  a  local  minimum  and  remains  at  this  level  for  the  duration  of  the 
experiment.  In  fact,  it  does  not  seem  to  ever  get  ‘unstuck’  with  these  problems, 
as  indicated  by  experiments  with  a  cutoff  time  of  1000  seconds,  in  which  no 
better  solutions  were  found  after  the  first  few  seconds. 

Figure  2a  shows  anytime  curves  for  the  same  problem  set  as  in  Figure  1, 
when  strategies  for  escaping  local  minima  are  incorporated  into  min-conflicts. 
Although  the  basic  min-conflicts  strategy  is  superior  at  first,  the  curves  for  both 
the  walk  and  retry  strategies  overtake  the  former  after  approximately  one  second, 
and  within  10-50  seconds  an  optimal  solution  is  found  in  the  great  majority  of 
cases  (specifically,  in  105  of  the  125  runs  for  walk  and  119/125  for  retry,  versus 
19/125  for  the  basic  min-conflicts  procedure). 

Figure 2b shows corresponding curves for a set of problems having the same
density but a greater number of violations in the optimal solution. The same
pattern of results is observed as in the former experiment: all three variants
show a rapid initial descent, but after approximately one second, the basic
version reaches a plateau, while the other two continue to improve and
eventually surpass the former.

Fig. 2. Averaged anytime curves for basic min-conflicts versus min-conflicts
with random walk and retry strategies. 30-variable problems, exp. density =
0.10. Problems in a. had tightness = 0.46, mean optimal distance = 2.08.
Problems in b. had tightness = 0.60, mean optimal distance = 8.68.


A further experiment was run as a more direct test of the similarity of results
with this heuristic procedure for problems with complete solutions and for
PCSPs. A single set of 100 problems was generated near the phase transition
[12], so that about 50% had solutions. (Problems had 30 variables, domain size
of 5 and density = 0.10, as did the problems just discussed, and an expected
tightness = 0.42.) The first 25 problems generated that had a complete solution
and the first 25 that did not formed the two groups in this case. These, of
course, do not represent samples from a single population, but use of the same
parameters for generation ensures a fair degree of similarity. Tests with
complete methods showed that the problems with complete solutions had more
optimal solutions on average (788 thousand versus 36 thousand for problems
without complete solutions); on the other hand, the medians were much closer
(29 and 15 thousand, respectively). Therefore, one might expect some
differences in favor of the former set of problems, i.e., a more rapid descent
to the mean optimal distance. But since this should pertain to a relatively
small number of problems, the averaged curves should be similar.

In fact, anytime curves for both min-conflicts and min-conflicts plus walk
(prob. = 0.35) closely parallel each other (Figure 3), and the average
difference between procedures is almost identical for both sets of problems.
(Thus, after 100 sec the difference is 1.70 and 1.67 violations in favor of
mincon-walk, for problems with and without solutions, respectively.) In this
experiment, min-conflicts with the walk strategy found optimal solutions on
almost every run (124/125 runs after 50 seconds for problems with solutions and
121/125 runs for problems without solutions).

Fig. 3. Anytime curves for min-conflicts and mincon-walk for two problem sets
based on a single set of parameters. Each problem in one set has a complete
solution; problems in the other set do not have complete solutions (mean
optimal distance = 1.56).

From these results it appears that min-conflicts behaves similarly on MAX-CSPs
and on problems with complete solutions. Moreover, simple strategies for
escaping local minima are quite effective, so that global minima can be
obtained, often in a short period of time. In these experiments the retry
strategy is superior to the walk strategy for more difficult problems,
especially those with greater average optimal distances. However, as shown in
the next subsection, these results are highly dependent on the specific values
used for the critical parameters: for retry, the number of reassignments before
restarting, and for walk, the probability of making a random rather than a
min-conflicts reassignment.
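The two parameters can be made concrete with a small sketch of min-conflicts for MAX-CSP. This is not the authors' Lisp implementation, only one plausible reading of the description: it starts from a random assignment (the greedy preprocessing step is omitted for brevity), a walk move gives a randomly chosen conflicted variable a random value, and a retry draws a fresh random assignment. The names `walk_prob`, `retry_after` and `target` are hypothetical, and constraints are represented as sets of forbidden value pairs keyed by variable pair `(i, j)` with `i < j`.

```python
import random


def count_conflicts(constraints, assignment):
    """Number of violated constraints under a complete assignment."""
    return sum((assignment[i], assignment[j]) in bad
               for (i, j), bad in constraints.items())


def var_conflicts(constraints, neighbors, assignment, var, value):
    """Number of constraints on `var` that are violated if it takes `value`."""
    total = 0
    for other in neighbors[var]:
        if var < other:
            total += (value, assignment[other]) in constraints[(var, other)]
        else:
            total += (assignment[other], value) in constraints[(other, var)]
    return total


def min_conflicts(constraints, n_vars, domain_size, max_steps,
                  walk_prob=0.0, retry_after=None, target=0, seed=None):
    """Return (fewest violations found, assignment achieving it)."""
    rng = random.Random(seed)
    neighbors = {v: [] for v in range(n_vars)}
    for i, j in constraints:
        neighbors[i].append(j)
        neighbors[j].append(i)

    assignment = [rng.randrange(domain_size) for _ in range(n_vars)]
    best = list(assignment)
    best_cost = count_conflicts(constraints, assignment)
    since_restart = 0
    for _ in range(max_steps):
        if best_cost <= target:  # stop once the known bound is reached
            break
        if retry_after is not None and since_restart >= retry_after:
            # Retry strategy: abandon the current assignment and start over.
            assignment = [rng.randrange(domain_size) for _ in range(n_vars)]
            since_restart = 0
        else:
            in_conflict = [
                v for v in range(n_vars)
                if var_conflicts(constraints, neighbors, assignment,
                                 v, assignment[v]) > 0
            ]
            if in_conflict:
                var = rng.choice(in_conflict)
                if rng.random() < walk_prob:
                    # Walk strategy: random value instead of the best one.
                    assignment[var] = rng.randrange(domain_size)
                else:
                    scores = [var_conflicts(constraints, neighbors,
                                            assignment, var, val)
                              for val in range(domain_size)]
                    low = min(scores)
                    assignment[var] = rng.choice(
                        [val for val, s in enumerate(scores) if s == low])
            since_restart += 1
        cost = count_conflicts(constraints, assignment)
        if cost < best_cost:
            best, best_cost = list(assignment), cost
    return best_cost, best


# Three mutually "not equal" variables over three values: solvable.
not_equal = {(0, 0), (1, 1), (2, 2)}
triangle = {(0, 1): not_equal, (0, 2): not_equal, (1, 2): not_equal}
print(min_conflicts(triangle, n_vars=3, domain_size=3, max_steps=10, seed=0))
```

Setting `target` to a known optimal distance reproduces the second type of experiment in Section 3, in which a run ends as soon as an optimal solution is found.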


4.2  Parametric Analysis of Mincon-Walk and Mincon-Retry

The studies in this section were run to determine the best parameter values for
the retry and walk strategies. The following procedure was used to obtain
problems across the entire spectrum of densities while meeting the basic
requirement of knowing the optimal distance for every problem tested. Values
were chosen for expected density that covered most of the range, from 0.10 to
0.90. In each case, a value was chosen for the expected tightness of a
constraint that gave problems with the same average optimal distance. This
entailed decreasing the expected tightness with each increase in density. (If a
single value is used for the tightness, optimal distance increases dramatically
with increasing density, as in [13], and as shown there and elsewhere (e.g.,
[10]), problems with greater optimal distances are much harder to solve with
complete methods.) In addition, the effect of optimal distance was tested by
generating sets of problems having the same densities but with different
distances. In this way, the effect of each problem parameter could be
separated.

All problems in this study had 30 variables. In the main series of problem
sets, the domain size was 5. Five values were chosen for expected density: 0.1,
0.3, 0.5, 0.7 and 0.9. In addition there were two limited and widely separated
ranges of average optimal distance: 1.52-2.12 and 8.64-8.68. For brevity, these
are referred to as problems with distance 2 and distance 8. To obtain problems
with a particular average optimal distance, expected tightness was varied, and
25 problems were generated for each parameter value, all of which were solved
to completion. This process continued until a sample was obtained for a given
density with the desired average distance. For the lower optimal distance,
problems were generated at all five densities. For the higher optimal distance,
problems were generated at the three lower densities, since at higher densities
problems were too difficult to solve with complete methods. (Note: these eight
problem sets included the four used in the first experiments described in the
last subsection.)


Table  1.  Average  time  (seconds)  required  to  find  an  optimal  solution  for  different 
random  problem  classes  and  walk  probabilities.  Problems  with  distance  2. 

                       walk probability
                  0.05   0.10   0.15   0.25   0.35
density = 0.10
  M                  4      2      2      5     26
  SD                 8      3      3      8     62
density = 0.30
  M                  7      4      5     17    157
  SD                10      6      7     40    378
density = 0.50
  M                 15      7      7     19
  SD                31     11      9     27
density = 0.70
  M                 11      5      6     23
  SD                25      8      7     37
density = 0.90
  M                 28     11      9     30
  SD                87     19     12     46

Notes.  M  is  mean,  SD  is  standard  deviation.  30-variable  problems  with  mean  optimal 
distance  1.5-2. 


After  the  requisite  problems  had  been  collected  and  their  optimal  distances 
verified,  this  information  was  used  in  more  systematic  experiments  with  heuristic 
methods,  to  determine  average  time  and  number  of  constraint  checks  needed  to 
find  an  optimal  solution.  In  these  experiments  five  values  for  the  walk  probability 
were  tested:  0.05,  0.10,  0.15,  0.25,  and  0.35.  Eight  different  values  were  tested 
for  the  retry  parameter  (number  of  new  assignments  before  restarting):  50,  100, 
150,  250,  500,  1000,  2500,  and  5000.  Each  set  of  25  problems  was  run  ten  times 
with  each  parameter  value.  The  mean  of  these  250  runs  is  the  statistic  shown  in 
the  tables  in  this  section. 
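The way each table cell is aggregated can be sketched as follows. The harness name `sweep` and the callback `run_once` are hypothetical; the callback stands in for one heuristic run that returns the time to reach an optimal solution. The source does not say whether SD is a sample or population standard deviation, so the use of `statistics.stdev` (sample SD) is an assumption.

```python
import statistics


def sweep(problems, param_values, runs_per_problem, run_once):
    """Return {param: (mean, sd)} over len(problems) * runs_per_problem runs."""
    table = {}
    for param in param_values:
        samples = [
            run_once(problem, param, run)
            for problem in problems
            for run in range(runs_per_problem)
        ]
        table[param] = (statistics.mean(samples), statistics.stdev(samples))
    return table


# Toy stand-in for a timed run, used only to show the shape of the result.
def fake_run(problem, param, run):
    return problem * param + run


print(sweep(problems=[1, 2], param_values=[1.0, 2.0],
            runs_per_problem=2, run_once=fake_run))
```

With 25 problems and ten runs per problem, each entry would summarize 250 samples, matching the statistic described above.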

Tables 1 and 2 show results for walk probabilities, the first table for
problems with distance 2, the second for problems with distance 8. Table 3
shows a more limited set of results for the retry parameter, for problems with
distance 2. For either strategy and for each set of problems there is a
curvilinear relation between parameter value and efficiency. For the walk
strategy, the best probability values are 0.10-0.15 when the average optimal
distance is 2. For problems with greater optimal distances, the performance
curve is shifted downward, with best results for probability 0.10 and next-best
at 0.05. In both cases the performance ranking is the same throughout the range
of densities tested. However, with increasing density, differences in
performance are accentuated, since effort increases more rapidly with poorer
parameter values.

Table 2. Average time (seconds) required to find an optimal solution for
different classes of random problems and walk probabilities. Problems with
distance 8.

                       walk probability
                  0.05   0.10   0.15   0.25
density = 0.10
  M                  3      3      7     76
  SD                 5      4     11    232
density = 0.30
  M                  7      6     13    243
  SD                 8      8     18    468
density = 0.50
  M                 20     12     22    367
  SD                40     16     33    616

The likely explanation for poorer results with high walk probabilities or low
restart values is that search is cut off before a promising part of the
solution space has been explored sufficiently. On the other hand, when the
strategy is too conservative (low walk probability or high restart value), more
time than necessary is spent in a particular part of the search space, so
efficiency is diminished. It is not clear why the best walk probability is
lower when the optimal distance is greater; this may be due to the greater
number of combinations of constraint violations with tighter constraints. In
addition, problem difficulty increases for all settings with increased density,
even with average optimal distance controlled; this may simply be due to the
number of constraints that must be checked at each step.


Table  3.  Average  time  (seconds)  required  to  find  an  optimal  solution  for  different 
classes  of  random  problems  and  retry  parameter  values.  Problems  with  distance  2. 

                       tries before restarting
                    50    100    150    250    500   1000   2500
density = 0.10
  M                 62     15     12     12     17     30     65
  SD               238     35     34     33     62     98    222
density = 0.50 (six values survive; their column positions are uncertain)
  M                 79     60     40     40     54    137
  SD               212    164     85     78    102    255

In this extended study of walk and retry strategies, the best walk probability
gave results that were clearly better than the best retry strategy, by a factor
of 5-6. From these results we can see that the original walk probability of
0.35 was much too high for these problems, while the retry parameter value of
150 was close to the best. In addition, there is some evidence that the best
value for walk probability varies less as a function of problem characteristics
than the best value for assignments before restarting.


Table  4.  Average  number  of  optimal  solutions  per  problem  for  problems  with  different 
densities  and  average  optimal  distance. 


                   dist ≈ 2   dist ≈ 8
density = 0.10
  M                  32,954        733
  SD                139,080       1385
density = 0.30
  M                     203         52
  SD                    225         59
density = 0.50
  M                     375
  SD                    762
density = 0.70
  M                     375
  SD                    762
density = 0.90
  M                      64
  SD                    128

When  the  optimal  distance  is  used  as  the  initial  upper  bound,  branch  and 
bound  can  find  all  optimal  solutions  to  problems  like  these  with  considerable 
efficiency.  This  allows  one  to  count  the  number  of  optimal  solutions  for  these 
problems.  Table  4  shows  the  average  number  of  optimal  solutions  for  problems 
with  different  expected  density  and  average  optimal  distance.  These  results  can 
be  compared  with  the  performance  of  min-conflicts  on  these  problems.  The  most 
important  finding  is  that,  for  walk  probabilities  at  least,  a  large  decrease  in  the 
number  of  optimal  solutions  (from  density  =  0.1  to  density  =  0.3,  and  from 
distance  2  to  distance  8)  has  only  small  effects  on  performance  when  the  best 
parameter  values  are  used. 
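Counting optimal solutions with a known bound can be sketched as a depth-first search that prunes any partial assignment whose violation count already exceeds the bound. This is a plain chronological version written for illustration; the forward checking and variable ordering used by the branch and bound algorithms of [1, 13] are omitted, and the function name is hypothetical. Constraints are again sets of forbidden value pairs keyed by `(i, j)` with `i < j`.

```python
def count_solutions_within(constraints, n_vars, domain_size, bound):
    """Count complete assignments that violate at most `bound` constraints.

    When `bound` equals the optimal distance, no assignment has fewer
    violations, so the count is the number of optimal solutions.
    """
    # For each variable, the constraints linking it to earlier variables.
    earlier = {v: [] for v in range(n_vars)}
    for (i, j), bad in constraints.items():
        earlier[j].append((i, bad))

    assignment = []

    def extend(var, violations):
        if var == n_vars:
            return 1
        total = 0
        for value in range(domain_size):
            added = sum((assignment[i], value) in bad
                        for i, bad in earlier[var])
            if violations + added <= bound:  # prune branches over the bound
                assignment.append(value)
                total += extend(var + 1, violations + added)
                assignment.pop()
        return total

    return extend(0, 0)


# Three mutually "not equal" variables over two values: the best possible
# assignment violates exactly one constraint.
not_equal = {(0, 0), (1, 1)}
triangle = {(0, 1): not_equal, (0, 2): not_equal, (1, 2): not_equal}
print(count_solutions_within(triangle, n_vars=3, domain_size=2, bound=1))
```

Starting with the bound already at the optimum lets the search prune from the first node onward, which is why this pass is cheaper than the search that establishes the optimum.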

Geometric problems also had 30 variables, a density of about 0.10, and expected
tightnesses that gave distances similar to those of the random problems. The
analysis of walk probabilities gave results that were similar to those with
random problems (Table 5). The slightly shorter mean time to find an optimal
solution for distance 2 problems may be due to the large number of optimal
solutions in these problems (a mean of 728 thousand). Again, the marked
reduction in number of optimal solutions for problems with higher optimal
distance (a mean of 6811 optimal solutions for problems with distance 8) is
accompanied by only a small increase in run time for the best walk
probabilities, while there is a much greater increase for probabilities
elsewhere in the range.

Table 5. Average time (seconds) required to find an optimal solution for
different classes of geometric problems and walk probabilities.

                       walk probability
                  0.05   0.10   0.15   0.25
mn. distance = 2.56
  M                  3      2      2      3
  SD                 8      4      5      7
mn. distance = 8.28
  M                  7      5     11     50
  SD                19     16     47    216

4.3  Comparisons among hill-climbing procedures

In recent years a number of improvements have been proposed for hill-climbing
search for CSPs, including the procedures outlined in Section 2. In evaluating
these new procedures, the authors have usually compared them with min-conflicts
in its basic form. In the present subsection, comparisons are made both with
the basic min-conflicts and with mincon-walk, using the best value for the walk
probability found in the studies recounted above. These tests were done using
the set of 30-variable PI problems having an expected density of 0.50 and an
average distance of 8.64. These were chosen as somewhat more difficult problems
among the 30-variable series tested (cf. Table 2). Data were also collected for
the problems with distance 2 and expected density = 0.10, which corroborate the
findings presented here.

Comparisons of both versions of min-conflicts with GSAT for CSPs, breakout,
EFLOP and weak commitment search are shown in Figure 4. (The curves for
min-conflicts are the same in both 4a and 4b.) Consistent with earlier
demonstrations (see papers cited in Section 2), each of these four
hill-climbing procedures is superior to the basic version of min-conflicts,
which indicates that they are effective in escaping from local minima, and
optimal solutions are found on most runs before the 100-second cutoff. GSAT,
however, lags behind the other three, presumably because of the amount of
checking involved at each step. (The failure of GSAT to outperform procedures
that do not try to find the best move at each step corroborates results with
SAT [4].) However, when the random walk strategy is added to min-conflicts, the
resulting procedure does about as well as any of the others. In fact, it is the
only one to find optimal solutions on all runs within 100 seconds.

Fig. 4. Averaged anytime curves for heuristic repair algorithms. 30-variable
problems, exp. density = 0.50, exp. tightness = 0.25, domain size = 5. In
figure b. the curve for EFLOP is slightly higher than that for weak commitment
at the end of the run.

Results with simulated annealing and tabu search, using parameter values
suggested from the literature (e.g., a tabu list of length 7 and a candidate
list of length 10 or 20) did not improve on min-conflicts with walk for these
problems. However, since these procedures can be parameterized in a variety of
ways (and since the importance of good parameter values is well demonstrated in
the present work), these results must be considered tentative.


4.4  Results  for  larger  problems 

Further experiments were done with 100-variable PI problems with expected
density  =  0.10,  domain  size  fixed  at  5,  and  expected  tightness  =  0.25.  These 
problems  have  not  been  solved  to  optimality  with  complete  methods,  so  the 
average  optimal  distance  is  not  known.  It  is,  of  course,  still  possible  to  make 
comparisons  between  procedures.  For  min-conflicts  with  walk  probability  =  0.05, 
the  average  best  distance  after  500  seconds  was  9.56.  Based  on  results  with 
smaller  problems  generated  in  the  same  fashion,  this  is  probably  close  to  the 
optimum. 

Figure 5a shows performance over time for three versions of min-conflicts. The
walk and retry results are each part of a series: walk probabilities of 0.10,
0.15 and 0.25 and retries after every 250, 500, 1000, 5000 and 25,000
assignments were also tested. The parameter values shown in the figure were the
best that were found in each series. (Note that the best value for walk
probability is close to the best values found with 30-variable problems, while
the best value for the retry parameter is much greater than the best value
found with the smaller problems.) In a further test, the best walk and retry
values were combined, but this did not improve on the walk probability shown
here.

Figure 5b shows a comparison of mincon-walk with the two procedures that gave
best results on the 30-variable problems. For these larger problems
min-conflicts augmented with the simple walk strategy does as well as or better
than the other procedures that were designed to enhance hill-climbing search.

Fig. 5. Averaged anytime curves for heuristic repair algorithms. 100-variable
problems, exp. density = 0.10, exp. tightness = 0.25, domain size = 5.


5  Conclusions 

By combining branch and bound and heuristic methods within a single
experimental paradigm, it has been possible to analyze the effectiveness of
local search techniques for PCSPs with some precision. Although most of the
work was done on random problems, the results appear to have some degree of
generality, as indicated by findings with geometric problems. Obviously, this
approach can be extended to other types of problems that are small enough to be
solved by complete methods. The results of the last section also show that
extensions to larger problems can be partly supported by more precise analysis
of similar smaller problems.

This work provides a partial answer to the question posed in the Introduction.
While the basic min-conflicts procedure does not usually find an optimal
solution to a MAX-CSP, variants which introduce an element of randomization can
find optimal solutions with great efficiency, at least for small to
medium-sized problems. The effectiveness of these methods does depend on proper
settings of parameter values in the procedure. However, for the most effective
strategy the best settings were always within a fairly restricted range. These
results also indicate that the random walk strategy is superior to a repeated
restart strategy. In addition, there was no evidence of improvement when
restarting was combined with the walk strategy.

The comparisons of Sections 4.3 and 4.4 raise some interesting questions about
the various enhancements of hill-climbing that have been proposed in the past
few years. All of them are successful in escaping from local minima. But it is
not clear whether they offer any benefits other than those gained by using a
simple random walk strategy, which is probably less expensive. In some cases,
such as EFLOP, this strategy may be the critical feature in the procedure.
(EFLOPing always begins with a random assignment to a variable in conflict.)
Other procedures, such as GSAT and weak commitment, use the restarting
strategy, which appears to be inferior, at least for PCSPs. GSAT also uses a
strategy of finding the best move to make, which is very expensive and which
does not appear to pay off with appreciably better moves. Together, these
results show how important it is to critically examine the features of a
procedure, as done here and in the earlier work of [4].

Abstract. This paper presents an initial report on an innovative approach
for solving satisfiability problems (SAT), i.e., creating a logic circuit
that is specialized to solve each problem instance on Field Programmable
Gate Arrays (FPGAs). Until quite recently, this approach was unrealistic
since creating special-purpose hardware was very expensive and time
consuming. However, recent advances in FPGA technologies and automatic
logic synthesis technologies have enabled users to rapidly create
special-purpose hardware by themselves.

This approach brings a new dimension to SAT algorithms, since all
constraints (clauses) can be checked simultaneously using a logic circuit.
We develop a new algorithm called parallel-checking, which assigns all
variable values simultaneously, and checks all constraints concurrently.
Simulation results show that the order of the search tree size in this
algorithm is approximately the same as that in the Davis-Putnam procedure.
Then, we show how the parallel-checking algorithm can be implemented on
FPGAs. Currently, actual implementation is under way. We have obtained
promising initial results which indicate that we can implement a hard
random 3-SAT problem with 300 variables, and run the logic circuit at clock
rates of about 1 MHz, i.e., it can check one million states per second.


1  Introduction 

A constraint satisfaction problem (CSP) is a general framework that can
formalize various problems in Artificial Intelligence, and many theoretical and
experimental studies have been performed [9]. In particular, a satisfiability
problem for propositional formulas in conjunctive normal form (SAT) is an
important subclass of CSP. This problem was the first computational task shown
to be NP-hard [3].

Virtually all existing SAT algorithms are intended to be executed on
general-purpose sequential/parallel computers. As far as the authors know,
there has been no study on solving SAT problems by creating a logic circuit
specialized to solve each problem instance. This is because until quite
recently, creating special-purpose hardware was very expensive and time
consuming. Therefore, making a logic circuit for each problem instance was not
realistic at all. However, due to recent advances in Field Programmable Gate
Array (FPGA) technologies [1], users can now create logic circuits by
themselves, and reconfigure them electronically, without any help from LSI
vendors. Furthermore, by using current automatic logic synthesis technologies
[2], [11], users are able to design logic circuits automatically using a high
level hardware description language (HDL). These recent hardware technologies
have enabled users to rapidly create logic circuits specialized to solve each
problem instance.

In this paper, we present an initial report on an innovative approach for
solving SAT, i.e., creating a logic circuit that is specialized to solve each
problem instance using FPGAs. This approach brings a new dimension to SAT
algorithms, since all constraints (clauses) can be checked simultaneously using
a logic circuit.

We develop a new algorithm called the parallel-checking algorithm. This
algorithm has the following characteristics.

-  Instead of determining variable values sequentially, all variable values are
determined simultaneously, and all constraints are checked concurrently.
Multiple variable values can be changed simultaneously when some constraints
are not satisfied.

-  In order to prune the search space, this algorithm introduces a technique
similar to forward checking [8].

Simulation  results  show  that  the  order  of  the  search  tree  size  in  this  algorithm 
is  approximately  the  same  as  that  in  the  Davis-Putnam  procedure  [5],  which  is 
widely  used  as  a  complete  search  algorithm  for  solving  SAT  problems. 

Then, we show how the parallel-checking algorithm can be implemented on
FPGAs by using recent hardware technologies. The actual implementation is
currently under way. We have obtained promising initial results, which indicate
that we can implement a hard random 3-SAT problem with 300 variables and run
the logic circuit at a clock rate of about 1 MHz, i.e., it can check one million
states per second.

In the remainder of this paper, we briefly describe the problem definition
(Section 2), and describe the parallel-checking algorithm in detail (Section 3).
Then, we show simulation results for evaluating the search tree size of this
algorithm (Section 4). Furthermore, we show how to implement this algorithm on
FPGAs and describe the status of the current implementation (Section 5).
Finally, we discuss the relation of this algorithm to recently developed algorithms
[4], [6], [7] that improve the Davis-Putnam procedure (Section 6).

2  Problem  Definition 

A satisfiability problem for propositional formulas in conjunctive normal form
(SAT) can be defined as follows. A boolean variable $x_i$ is a variable that takes
the value true or false (represented as 1 or 0, respectively). In this paper, in
order to simplify the algorithm description, we represent the fact that $x_i$ is
true as $(x_i, 1)$. We call the value assignment of one variable a literal. A clause
is a disjunction of literals, e.g., $(x_1, 1) \lor (x_2, 0) \lor (x_4, 1)$, which
represents the logical formula $x_1 \lor \lnot x_2 \lor x_4$. Given a set of clauses
$C_1, C_2, \ldots, C_m$ and variables $x_1, x_2, \ldots, x_n$, the satisfiability
problem is to determine if the formula

$C_1 \land C_2 \land \ldots \land C_m$

is satisfiable, that is, whether there is an assignment of values to the variables
such that the above formula is true.

In  this  paper,  if  the  formula  is  satisfiable,  we  assume  that  we  need  to  find  all  or 
a  fixed  number  of  solutions,  i.e.,  the  combinations  of  variable  values  that  satisfy 
the  formula.  Most  of  the  existing  algorithms  for  solving  SAT  aim  to  find  only  one 
solution.  Although  this  setting  corresponds  to  the  original  problem  definition, 
some  application  problems,  such  as  visual  interpretation  tasks  [14],  and  diagnosis 
tasks  [13],  require  finding  all  or  multiple  solutions.  Furthermore,  since  finding  all 
or  multiple  solutions  is  usually  much  more  difficult  than  finding  only  one  solution, 
solving  the  problem  by  special-purpose  hardware  will  be  worthwhile.  Therefore, 
in  this  paper,  we  assume  that  the  goal  is  to  find  all  or  multiple  solutions. 

In  the  following,  for  simplicity,  we  restrict  our  attention  to  3-SAT  problems, 
i.e.,  the  number  of  literals  in  each  clause  is  3.  Relaxing  this  assumption  is  rather 
straightforward. 


3  Algorithm 

3.1  Basic  Ideas 

We  are  going  to  describe  the  basic  ideas  of  the  parallel-checking  algorithm.  This 
algorithm  is  obtained  by  gradually  improving  a  simple  enumeration  algorithm. 


Simple Enumeration Algorithm: We represent one combination of value assignments
of all variables as an $n$-digit binary value. Assuming that variable $x_i$'s
value is $v_i$, a combination of value assignments can be represented by the $n$-digit
binary value $\sum_{i=1}^{n} 2^{i-1} v_i$, in which the value of the $i$-th digit
(counted from the lowest digit) represents the value of $x_i$. We call one
combination of all variable values one state. In this algorithm, the state is
incremented from 0 to $2^n - 1$. For each state, the algorithm checks whether the
clauses are satisfied. If all clauses are satisfied, the state is recorded as a
solution. Obviously, this algorithm is very inefficient since it must check all
$2^n$ states.
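
A minimal Python sketch of the simple enumeration algorithm is given here for illustration. The representation of a clause as a list of (index, value) pairs and the names `clause_satisfied` and `enumerate_solutions` are assumptions of this sketch, not part of the paper; a state is an integer whose $i$-th lowest bit holds the value of $x_i$.

```python
def clause_satisfied(clause, state):
    """A clause is a list of literals (i, v); variable indices i are 1-based.

    The literal (i, v) holds when the i-th lowest bit of `state` equals v.
    """
    return any(((state >> (i - 1)) & 1) == v for i, v in clause)


def enumerate_solutions(n, clauses):
    """Check every state from 0 to 2**n - 1 and record the satisfying ones."""
    solutions = []
    for state in range(2 ** n):
        if all(clause_satisfied(c, state) for c in clauses):
            solutions.append(state)
    return solutions


if __name__ == "__main__":
    # Example instance (hypothetical data chosen for this sketch):
    # (x1 or x4 or x5) and (not x1 or x3 or x4) and (not x1 or x2 or x5)
    example = [
        [(1, 1), (4, 1), (5, 1)],
        [(1, 0), (3, 1), (4, 1)],
        [(1, 0), (2, 1), (5, 1)],
    ]
    print(enumerate_solutions(5, example))
```

The loop visits all $2^n$ states regardless of the clauses, which is exactly the inefficiency the following refinements address.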


Introducing Backtracking: When some clauses are not satisfied, instead of
incrementing $x_1$'s digit, we can increment the lowest digit that is included in
these unsatisfied clauses; thus the number of searched states can be reduced. The
algorithm obtained after this improvement is very similar to the backtracking
algorithm where the order of the variable/value selection is fixed.
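
The following Python sketch shows one way to read this refinement; it is an illustration under stated assumptions rather than the authors' implementation. For each unsatisfied clause it takes the clause's lowest digit, and it jumps at the highest of these digits (an assumption of this sketch: no change restricted to lower digits can repair that clause). The function names are hypothetical.

```python
def clause_satisfied(clause, state):
    """Literal (i, v) holds when the i-th lowest bit of `state` equals v."""
    return any(((state >> (i - 1)) & 1) == v for i, v in clause)


def enumerate_with_backtracking(n, clauses):
    """Enumerate all solutions, skipping states that cannot repair a violated clause.

    Returns (solutions, number_of_visited_states).
    """
    solutions, visited = [], 0
    state = 0
    while state < 2 ** n:
        visited += 1
        unsatisfied = [c for c in clauses if not clause_satisfied(c, state)]
        if not unsatisfied:
            solutions.append(state)
            jump = 1  # plain increment after a solution
        else:
            # Assumption of this sketch: jump at the highest "lowest digit"
            # over all unsatisfied clauses.
            jump = max(min(i for i, _ in c) for c in unsatisfied)
        # Smallest state above the current one in which digit `jump`
        # (or a higher digit) has changed; all lower digits become 0.
        state = ((state >> (jump - 1)) + 1) << (jump - 1)
    return solutions, visited
```

For a single clause over $x_2, x_3, x_4$ with $n = 4$, state 0 violates the clause, so the sketch jumps directly to state 2 and never visits state 1.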

Introducing Forward Checking (i): Furthermore, instead of checking the
current value (0 or 1) only, if we check the other value concurrently, we can reduce
the number of searched states. For example, assume that there exist five variables
$x_1, \ldots, x_5$ and three clauses:

$C_1$: $(x_1, 1) \lor (x_4, 1) \lor (x_5, 1)$,

$C_2$: $(x_1, 0) \lor (x_3, 1) \lor (x_4, 1)$,

$C_3$: $(x_1, 0) \lor (x_2, 1) \lor (x_5, 1)$.

The initial state $\{(x_1, 0), (x_2, 0), (x_3, 0), (x_4, 0), (x_5, 0)\}$ does not
satisfy $C_1$. If we increment $x_1$'s digit and change $x_1$'s value to 1, then
$C_2$ and $C_3$ are not satisfied. If we also perform the check for the case that
$x_1$'s value is 1, we can confirm that incrementing $x_1$'s digit is useless.

In this case, which digit should be incremented? If $x_1$ is 0, $C_1$ is not satisfied,
and the second lowest digit in $C_1$ is $x_4$. If $x_1$ is 1, $C_2$ and $C_3$ are not
satisfied, and the second lowest digit in $C_2$ is $x_3$, while the second lowest
digit in $C_3$ is $x_2$. Therefore, we can conclude that at least $x_3$'s digit must
be changed to satisfy all clauses; changing digits lower than $x_3$ is useless.

This procedure is similar to the backtracking algorithm that introduces forward
checking [8], where backtracking is performed as soon as some variable has no
value consistent with the variables that have already been assigned values.


Introducing Forward Checking (ii): Another procedure that greatly contributes
to the efficiency of forward checking is to assign the variable value
immediately if the variable has only one value consistent with the variables that
have already been assigned values. This procedure is called unit resolution in
SAT.

In order to perform a similar procedure in this algorithm, for each variable
$x_i$, we define a value called $\mathrm{unit}(x_i)$. If $\mathrm{unit}(x_i) = j$,
there exists only one possible value for $x_i$ that is consistent with the upper
digit variables¹, and the second lowest digit in the clause that is constraining
$x_i$ is $x_j$'s digit.

For example, in the initial state $\{(x_1, 0), (x_2, 0), (x_3, 0), (x_4, 0), (x_5, 0)\}$
of the problem described in 3.1, $x_1$ has only one consistent value, 1, because of
$C_1$. Therefore, we set $\mathrm{unit}(x_1)$ to 4 (since $x_4$ is the second lowest
digit in $C_1$), and change $x_1$'s value to 1. The value of $\mathrm{unit}(x_1)$
represents the fact that unless at least $x_4$'s digit is changed, $x_1$ has only one
possible value. The next state will be
$\{(x_1, 1), (x_2, 0), (x_3, 1), (x_4, 0), (x_5, 0)\}$. This state does not satisfy
$C_3$. Since the lowest digit in $C_3$ is $x_1$, $x_1$'s value is changed in the
original procedure. However, the value of $\mathrm{unit}(x_1)$ is 4 and the second
lowest digit in $C_3$ is $x_2$, where $x_2$ is lower than $x_4$. Therefore, $x_2$'s
value is changed. The next state will be
$\{(x_1, 1), (x_2, 1), (x_3, 1), (x_4, 0), (x_5, 0)\}$. This state satisfies all
three clauses.

¹ When there exist multiple possible values, $\mathrm{unit}(x_i) = i$.


3.2 Details of the Algorithm

Term Definitions: In the following, we define concepts and terms used in the
algorithm. 


- Each clause $(x_i, v_i) \lor (x_j, v_j) \lor (x_k, v_k)$ is converted to the following rule
  (where flip(0) = 1, flip(1) = 0):

  if $(x_j, \mathrm{flip}(v_j)) \land (\mathrm{unit}(x_j) > i) \land (x_k, \mathrm{flip}(v_k)) \land (\mathrm{unit}(x_k) > i)$
  then $(x_i, \mathrm{flip}(v_i))$ is prohibited.

It  must  be  noted  that  one  clause  is  converted  to  three  different  rules,  since 
there  are  three  possibilities  for  choosing  the  variable  in  the  consequence  part. 

- Each rule is associated with the value $\mathrm{flip}(v_i)$ of variable $x_i$ in the
  consequence part.

- We call a rule active if its condition is satisfied in the current state.

- For value $v_i$ of variable $x_i$, if some rule associated with $v_i$ is active, we say
  that value $v_i$ is prohibited.

- For each active rule, if the variables in the condition part are $x_j, x_k$, we call
  $\min(\mathrm{unit}(x_j), \mathrm{unit}(x_k))$ the backtrack-position of the rule.

- If value $v_i$ of variable $x_i$ is prohibited, the maximum of the backtrack-positions
  of the active rules associated with $v_i$ is called $v_i$'s backtrack-position.

- When a state is given, we classify the condition of variable $x_i$ into the following
  four cases:

  satisfied/free: neither value is prohibited.

  satisfied/constrained: the current value $v_i$ is not prohibited, but $\mathrm{flip}(v_i)$
  is prohibited.

  not-satisfied/possible: the current value $v_i$ is prohibited, but $\mathrm{flip}(v_i)$ is
  not prohibited.

  not-satisfied/no-way: both values are prohibited.

- For a variable $x_i$ which is not-satisfied, we define the backtrack-position of $x_i$
  as follows:

  when $x_i$ is not-satisfied/no-way: the minimum (the lower digit) of the
  backtrack-positions of values 0 and 1.
  when $x_i$ is not-satisfied/possible: $i$ (its own digit).
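
The term definitions above can be sketched in Python as follows. This is a hedged illustration, not the SFL description used in the paper: the `Rule` type, the dictionary-based `state` and `unit` maps, and all function names are assumptions of this sketch, and the strict comparison `unit[j] > i` follows the rule as printed above.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    target: int        # index i of the variable in the consequence part
    prohibited: int    # the value flip(v_i) that the rule prohibits
    conditions: tuple  # ((j, flip(v_j)), (k, flip(v_k)))


def flip(v):
    return 1 - v


def clause_to_rules(clause):
    """Convert one clause [(i, v_i), (j, v_j), (k, v_k)] into its three rules."""
    rules = []
    for pos, (i, vi) in enumerate(clause):
        others = tuple((j, flip(vj)) for q, (j, vj) in enumerate(clause) if q != pos)
        rules.append(Rule(i, flip(vi), others))
    return rules


def is_active(rule, state, unit):
    """state[j] is the current value of x_j; unit[j] is unit(x_j)."""
    return all(state[j] == w and unit[j] > rule.target for j, w in rule.conditions)


def value_backtrack_position(i, v, rules, state, unit):
    """Backtrack-position of value v of x_i, or None if v is not prohibited."""
    positions = [
        min(unit[j] for j, _ in r.conditions)
        for r in rules
        if r.target == i and r.prohibited == v and is_active(r, state, unit)
    ]
    return max(positions) if positions else None


def classify(i, rules, state, unit):
    """Return one of the four conditions of variable x_i."""
    current = value_backtrack_position(i, state[i], rules, state, unit)
    other = value_backtrack_position(i, flip(state[i]), rules, state, unit)
    if current is None:
        return "satisfied/free" if other is None else "satisfied/constrained"
    return "not-satisfied/possible" if other is None else "not-satisfied/no-way"
```

Applied to the three clauses of Section 3.1 in the all-zero initial state with $\mathrm{unit}(x_i) = i$, the sketch reports value 0 of $x_1$ as prohibited with backtrack-position 4 and value 1 with backtrack-position 3, in line with the discussion of which digit must be changed.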


Parallel-Checking Algorithm: In the initial state, all variable values are 0,
and the value of $\mathrm{unit}(x_i)$ is $i$.

1. For each value of each variable, concurrently check whether the associated
   rules are active. For a prohibited value, calculate the backtrack-position of
   the value. Calculate the condition and backtrack-position of each variable.

2. For each variable $x_i$, if its condition is satisfied/constrained, set the value of
   $\mathrm{unit}(x_i)$ to the backtrack-position of $\mathrm{flip}(v_i)$, where $v_i$ is
   $x_i$'s current value. Otherwise, set $\mathrm{unit}(x_i)$ to $i$.

3. Calculate $m$, which is the maximum of the backtrack-positions of the not-satisfied
   variables. If all variables are satisfied, record the current state as a solution,
   and set $m$ to 1.

4. Calculate $max$, which is the lowest digit that satisfies the following conditions:
   $max \ge m$, $v_{max} = 0$, and $x_{max}$ is satisfied/free, where $v_{max}$ is
   $x_{max}$'s current value. If there exists no digit that satisfies these conditions,
   terminate the algorithm.

5. Change the value of $x_{max}$ from 0 to 1. For each variable $x_i$ which is lower
   than $x_{max}$, execute the following procedure:

   when $x_i$ is satisfied/constrained, and $\mathrm{unit}(x_i)$ is larger than $max$:
   change neither $x_i$ nor $\mathrm{unit}(x_i)$.

   when $x_i$ is not-satisfied and, for one of $x_i$'s values $v_i$, $v_i$'s
   backtrack-position is larger² than $max$: set $x_i$'s value to $\mathrm{flip}(v_i)$,
   and set $\mathrm{unit}(x_i)$ to $v_i$'s backtrack-position.

   otherwise: set $x_i$'s value to 0, and set $\mathrm{unit}(x_i)$ to $i$.

6. Return to 1.

4  Simulation  Results 

In this section, we evaluate the efficiency of the parallel-checking algorithm by
software simulation. We measured the number of searched states in the algorithm.
For comparison, we used the Davis-Putnam procedure [5], which is widely
used as a complete algorithm for solving SAT problems. The Davis-Putnam procedure
is essentially a resolution procedure. It performs backtracking search by
assigning the variable values and simplifying clauses. We call a clause that has
been simplified so that it contains only one literal a unit clause. When a unit
clause is generated, the value of the variable that is contained in the unit clause
is assigned immediately so that the unit clause is satisfied. This procedure is
called unit resolution.
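
A compact Python sketch of backtracking search with clause simplification and unit resolution is shown below for orientation. It is a simplified illustration that finds a single satisfying assignment; it is not the simulator used for the measurements, and the names `simplify` and `dp_solve` as well as the naive branching choice are assumptions of this sketch.

```python
def simplify(clauses, var, val):
    """Assign var := val. Satisfied clauses are dropped and falsified literals
    are removed. Returns None if some clause becomes empty (a conflict)."""
    result = []
    for clause in clauses:
        if (var, val) in clause:
            continue  # the clause is satisfied by this assignment
        reduced = [lit for lit in clause if lit[0] != var]
        if not reduced:
            return None
        result.append(reduced)
    return result


def dp_solve(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment as a dict, or None.

    Variables missing from the returned dict may take either value.
    """
    assignment = dict(assignment or {})
    # Unit resolution: assign the variable of every unit clause immediately.
    while True:
        unit_literal = next((c[0] for c in clauses if len(c) == 1), None)
        if unit_literal is None:
            break
        var, val = unit_literal
        assignment[var] = val
        clauses = simplify(clauses, var, val)
        if clauses is None:
            return None
    if not clauses:
        return assignment
    # Binary choice: branch on the first variable of the first clause
    # (an arbitrary choice made for this sketch).
    var = clauses[0][0][0]
    for val in (0, 1):
        reduced = simplify(clauses, var, val)
        if reduced is not None:
            found = dp_solve(reduced, {**assignment, var: val})
            if found is not None:
                return found
    return None
```

Each pass through the `for val in (0, 1)` loop corresponds to one binary choice, while assignments made in the unit-resolution loop are not choices; this is the distinction used when counting visited states below.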

We  use  hard  random  3-SAT  problems  as  example  problems.  Each  clause  is 
generated  by  randomly  selecting  three  variables,  and  each  of  the  variables  is  given 
the  value  0  or  1  (false  or  true)  with  a  50%  probability.  The  number  of  clauses 
divided  by  the  number  of  variables  is  called  the  clause  density,  and  the  value 
4.3  has  been  identified  as  the  critical  value  that  produces  particularly  difficult 
problems  [10]. 
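
The generation procedure can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `random_3sat`, the use of three distinct variables per clause, and the rounding of the clause count to the nearest integer are choices of this sketch and are not specified in the text.

```python
import random


def random_3sat(n, density=4.3, seed=None):
    """Generate a random 3-SAT instance with round(density * n) clauses.

    Each clause is a list of three literals (variable index, value), with
    variable indices in 1..n and each value drawn uniformly from {0, 1}.
    """
    rng = random.Random(seed)
    m = round(density * n)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)  # three distinct variables
        clauses.append([(x, rng.randint(0, 1)) for x in variables])
    return clauses
```

With $n = 128$ and density 4.3, this yields 550 clauses, which matches the size of the instance reported in Section 5.4.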

In  Fig.  1,  we  show  the  log-scale  plot  of  the  average  number  of  visited  states 
over  100  example  problems,  by  varying  the  number  of  variables  n,  where  the 
clause  density  is  fixed  to  4.3.  The  number  of  visited  states  in  the  Davis-Putnam 
procedure  is  the  number  of  binary  choices  made  during  the  search;  it  does  not 
include the number of unit resolutions. Since a randomly generated 3-SAT problem
tends to have a very large number of solutions when it is solvable, in order
to finish the simulation within a reasonable amount of time, we terminate each
execution after the first 100 solutions are found.

² Since $m$ is the maximum of the backtrack-positions of all not-satisfied variables, the
backtrack-positions of both values cannot be larger than $max$, which is no smaller than
$m$.

We perform a simple variable rearrangement before executing these algorithms,
i.e., the variables are rearranged so that strongly constrained variables
(variables included in many clauses) are placed in higher digits, and the variables
that are related by constraints (the variables included in the same clause)
are placed as close as possible. The Davis-Putnam procedure utilizes this
rearrangement by selecting variables in order from $x_n$ to $x_1$ (except for unit
resolutions).

From  Fig.  1,  we  can  see  the  following  facts. 

- The number of visited states in the parallel-checking algorithm is three to
  eight times larger than that in the Davis-Putnam procedure. This result
  is reasonable since the number of states in the Davis-Putnam procedure
  does not include the number of unit resolutions, while the number of states
  in the parallel-checking algorithm includes state transitions that are caused
  by unit resolutions.

- The order of visited states (the order of the search tree size) in the
  parallel-checking algorithm is approximately the same as that in the Davis-Putnam
  procedure, i.e., for each algorithm, the number of visited states grows at the
  same rate as the number of variables increases.

The computation executed for each state in the Davis-Putnam procedure is of
the order of $O(n)$ (which includes repeated applications of unit resolution). On
the other hand, the computation executed for each state in the parallel-checking
algorithm can be finished in one clock³ when the algorithm is implemented on
FPGAs.

These results indicate that the parallel-checking algorithm implemented on
FPGAs will be much more efficient than the Davis-Putnam procedure implemented
on a general-purpose computer.

5  Implementation 

In this section, we give a brief description of FPGAs. Then, we show how the
parallel-checking algorithm can be implemented on FPGAs, and report the current
status of our implementation.


5.1  Field  Programmable  Gate  Arrays 

An example of FPGA architecture is shown in Fig. 2. It consists of a two-dimensional
array of programmable logic blocks, with routing channels between
these blocks. These logic blocks and interconnections are user-programmable by
rewriting static RAM cells. We use an FPGA hardware system called ZyCAD
RP2000 [16]. This system has 32 FPGA chips (each chip is a Xilinx XC4010),
and can implement a large-scale logic circuit by dividing it into multiple FPGA
chips. The equivalent gate count of a Xilinx XC4010 is about 8.0k to 10.0k.

³ The required time for one clock is not constant, since the possible clock rate of a
logic circuit is determined by the delay of the logic circuit, and the delay is certainly
affected by the problem size $n$. However, the order would be much smaller than $O(n)$,
i.e., at most $O(\log n)$.

Fig. 1. Results for parallel-checking and Davis-Putnam procedure on hard random
3-SAT problems (horizontal axis: number of variables)

5.2  Logic  Circuit  Configuration 

We show the configuration of the logic circuit that implements the
parallel-checking algorithm in Fig. 3. The logic circuit consists of the following three
functional units.

1.  Rule  Checker 

2.  Next  State  Generator 

3.  Next  Unit  Generator 

In the Rule Checker, the condition and the backtrack-position of each digit are
calculated from the current state and the unit values in parallel. The Next State
Generator first calculates $max$, i.e., the digit that must be incremented, using
the outputs of the Rule Checker. Then, the Next State Generator calculates the
next state by incrementing the digit of $max$. The value of each lower digit is
determined by its condition and backtrack-position. The Next Unit Generator
calculates the unit values in the next state, using the outputs of the Rule Checker
and $max$. These calculated values (the next state and next unit values) are used
as feedback and stored in registers.

5.3  Logic  Circuit  Synthesis 

A  logic  circuit  that  solves  a  specific  SAT  problem  is  synthesized  by  the  following 
procedure  (Fig.  4).  First,  a  text  file  that  describes  a  SAT  problem  is  analyzed  by 
an  SFL  generator  written  in  the  C  language.  This  program  generates  a  behavioral 
description  specific  to  the  given  problem  with  an  HDL  called  SFL.  Then,  a  CAD 
system  analyzes  the  description  and  synthesizes  a  netlist,  which  describes  the 
logic  circuit  structure.  We  use  a  system  called  PARTHENON  [2],  [11],  which  was 
developed  at  NTT.  PARTHENON  is  a  highly  practical  system  that  integrates  a 
description  language,  simulator,  and  logic  synthesizer.  Furthermore,  the  FPGA 
Mapper  of  the  Zycad  system  generates  FPGA  mapping  data  for  RP2000  from 
the  netlist. 


Fig. 4. Flow of logic circuit synthesis


5.4 Current Implementation Status

We have developed an SFL generator which generates a description of the logic
circuit that implements the parallel-checking algorithm. By using this program,
we have successfully implemented a hard random 3-SAT problem with 128 variables
and 550 clauses using our current hardware resources. This logic circuit
is capable of running at a clock rate of at least 1 MHz, which is the maximal
setting of the clock generator we are currently using. Therefore, we can
assume that it is capable of running at higher clock rates. The logic circuit for
a 128-variable problem fits in the current hardware resources. Since the number
of FPGA chips in this system can be increased up to 64 (twice as many as the
current configuration) and the mapping quality can stand further improvement,
we could possibly implement much larger problems, e.g., a problem with 300
variables, without any trouble⁴. We are increasing our hardware resources and
trying to implement much larger problems.

Currently, generating a logic circuit from a problem description takes a few
hours. This is because we are using general-purpose synthesis routines. These
routines can be highly optimized for SAT problems since many parts of a logic
circuit are common to all problem instances. We expect that the required time for
generating a logic circuit could be reduced to a few tens of minutes at most.

6  Discussions 

Recently, several improved versions of the Davis-Putnam procedure have been
developed [4], [6], [7]. These algorithms use various sophisticated variable/value
ordering heuristics. How good is the parallel-checking algorithm compared with
these algorithms? Unfortunately, these algorithms aim to find only one solution,
and they introduce various procedures for simplifying formulas, such as removing
variable values that do not affect the satisfiability of the problem. Therefore,
the evaluation results of these algorithms cannot be compared directly with the
results of the parallel-checking algorithm, and it is not straightforward to modify
these algorithms so that they can find all solutions. In future work, we are
going to examine these algorithms carefully, modify them so that they can find
all solutions, and compare the modified algorithms with the parallel-checking
algorithm. Furthermore, we are going to examine the possibility of introducing
the heuristics used in these algorithms into the parallel-checking algorithm.

7  Conclusions  and  Future  Works 

This paper presented an initial report on solving SAT using FPGAs. In this
approach, a logic circuit specific to each problem instance is created on FPGAs.
This approach brings a new dimension to SAT algorithms since all constraints
can be checked in parallel using a logic circuit. We developed a new algorithm
called parallel-checking, which assigns all variable values simultaneously, and
checks all constraints concurrently. Simulation results showed that the order of
the search tree size in the parallel-checking algorithm is approximately the same
as that in the Davis-Putnam procedure, which is widely used as a complete
algorithm for solving SAT problems. We have implemented a hard random 3-SAT
problem with 128 variables, and run the logic circuit at a clock rate of about
1 MHz, i.e., it can check one million states per second. Currently, we are increasing
our hardware resources so that much larger problems can be implemented. We
are going to perform various evaluations on implemented logic circuits.

⁴ Of course, the efficiency of the parallel-checking algorithm must be improved in order
to solve such a large-scale problem within a reasonable amount of time.

Our future work includes comparing this approach to recently developed
algorithms [4], [6], [7] that improve the Davis-Putnam procedure, and introducing
the heuristics used in these algorithms into the parallel-checking algorithm.
Furthermore, we are going to implement iterative improvement algorithms for
solving SAT [12], [15] on FPGAs.


A Constraint Program
for Solving the Job-Shop Problem


Jianyang Zhou*

Laboratoire  d’Informatique  de  Marseille 


Abstract 

In this paper, a method within the framework of propagation of interval
constraints and based on the branch-and-bound optimization scheme for solving
the job-shop scheduling problem will be presented. The goal is to provide
a constraint program which is clean, flexible and robust. The design of the
constraint program is based on an idea of sorting the release and due dates
of tasks, which is a successful application of a previous but not yet published
work on a distinct integers constraint. Based on the sorting constraint, by
assembling redundant constraints and applying an efficient search strategy, the
current program for the job-shop problem can solve the ten 10 × 10 instances
in the paper of Applegate and Cook (1991) in satisfactory computational time.
Moreover, good results have been achieved on some harder instances.

Keywords:  Constraint  Programming,  Interval  Constraints,  Constraint  of 
Distinct  Integers,  Permutation,  Sorting,  Job-shop  Scheduling 


1  Introduction 

Given $n$ jobs, each consisting of $m$ tasks that have to be processed on $m$ machines,
the job-shop problem requires scheduling the jobs on the machines so as to minimize
the maximum of the completion times of all jobs, subject to:

• the order $(\tau_{1j}, \ldots, \tau_{mj})$ of the $m$ machines to process any job $j$ is known
(the precedence constraint);

• the duration $d_{ij}$ of job $j$ processed on machine $i$ is known
(the duration constraint);

• on any machine, at any moment, there can be at most one task processed
(the disjunctive constraint).

* This work has been supported by the Esprit Project Acclaim no. 7195.

This problem is NP-hard in the strong sense [11]. A famous instance of it is the
10 × 10 instance mt10 posed in 1963 by Muth and Thompson [13], which had resisted the
efforts of many researchers for over 20 years before it was solved by Carlier and Pinson
in 1989 [4]. The approach of Carlier and Pinson is to settle the disjunctive constraints
by ordering task pairs with an edge-finding search strategy, which concentrates on the
possible first or last tasks of a certain set of tasks on a certain machine to establish
a search tree [4, 1, 2, 7]. To our knowledge, many successful algorithms for solving the
job-shop problem are based on such a resolution scheme. In this paper, we propose
a new way, based on the idea of sorting the release and due dates of jobs, to solve
the job-shop problem. The principal distinction from the method of Carlier and
Pinson is the following: with the distinct integers constraints used, the disjunctive
constraints of the job-shop problem are settled more globally than by concentrating
only on the orders of task pairs, and a more global search strategy oriented by the
distinct integers constraints is applied. The technique used is the system of interval
constraints [3, 8, 10, 15]. The goal is to give a constraint program which is clean,
flexible and robust.
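
To make the problem statement at the start of this section concrete, the following Python sketch checks a candidate schedule against the precedence and disjunctive constraints and returns its makespan. It is only an illustration: the data layout (`order`, `duration`, `start`) and the function name `makespan_if_feasible` are assumptions of this sketch and are unrelated to the constraint program developed in the paper.

```python
def makespan_if_feasible(order, duration, start):
    """Return the makespan of a schedule, or None if it violates a constraint.

    order[j]         : list of machines in the order in which job j visits them
    duration[(i, j)] : processing time of job j on machine i
    start[(i, j)]    : proposed start time of job j on machine i
    """
    # Precedence constraint: within a job, each task must finish before
    # the next one in the job's machine order starts.
    for j, machines in enumerate(order):
        for a, b in zip(machines, machines[1:]):
            if start[(a, j)] + duration[(a, j)] > start[(b, j)]:
                return None
    # Disjunctive constraint: on each machine, tasks must not overlap.
    by_machine = {}
    for (i, j), s in start.items():
        by_machine.setdefault(i, []).append((s, s + duration[(i, j)]))
    for intervals in by_machine.values():
        intervals.sort()
        for (_, end1), (start2, _) in zip(intervals, intervals[1:]):
            if end1 > start2:
                return None
    # Objective: the maximum completion time over all tasks.
    return max(s + duration[task] for task, s in start.items())
```

Minimizing the value returned by this check over all feasible start-time assignments is the optimization problem addressed in the rest of the paper.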


2  The  constraint  solving  paradigm 

2.1  Introduction 

In 1987, Cleary introduced the interval method into logic programming for logical
arithmetic [8]. This technique was then developed for constraint systems at
Bell-Northern Research (BNR) in BNR Prolog [15]. As a new approach for solving
constraints, the interval method has a possible drawback: relatively poor pruning
efficiency, because domains are reduced only at their two bounds. It nevertheless
has several advantages: it provides a simple and clean technique for constraint
solving, which makes it promising for solving large-scale constraint systems; it is
a good candidate for dealing with continuous problems; and it is a good candidate for
systems of mixed constraints, which brings about elegant operational semantics. In
this paper, to pursue simplicity, we deal only with intervals of integers.

2.2  Notations  and  definitions 

Let $Z$ be the set of all integers. An interval $I$ is a subset of $Z$ of the form
$\{k \in Z \mid i \le k \le j\}$, where $i, j$ are elements of $Z \cup \{\pm\infty\}$, the
order relation $\le$ being extended so that $-\infty < k < +\infty$ for every integer $k$.
Such an interval $I$ will be denoted by $[i, j]$.

Let $V$ be a set of variables. We will assume that to each variable $x$ of $V$ an
interval $\mathrm{dom}(x)$, called the domain of $x$, is associated. To simplify our
notation, the upper bound and lower bound of $\mathrm{dom}(x)$ are denoted respectively by
$\overline{x}$ and $\underline{x}$. The cardinality of $\mathrm{dom}(x)$ is denoted by
$|x|$; it is either an integer or $\infty$.

A constraint over $V$ is an expression of the form $r(x_1, \ldots, x_n)$, where the
$x_i$ are distinct variables taken from $V$ and where $r$ is an $n$-ary relation over $Z$,
that is to say a subset of $Z^n$. In this paper, we are interested in dealing with the
16 kinds of constraints displayed in Table 1. They will be used in the constraint
program for solving the job-shop problem.

A system $S$ of constraints is a conjunction of constraints over $V$. A mapping
$\sigma$ from $V$ to $Z$ is a solution of $S$ iff, for every constraint
$r(x_1, \ldots, x_n)$ of $S$, $(\sigma(x_1), \ldots, \sigma(x_n)) \in r$. A mapping
$\sigma'$ from a subset $V'$ of $V$ to $Z$ is called a partial solution of $S$ iff there
exists a solution $\sigma$ of $S$ such that $\sigma(x) = \sigma'(x)$ for all $x \in V'$.

For any subset $\rho$ of $Z^n$, $\mathrm{hull}(\rho)$ denotes the smallest cartesian
product of intervals containing $\rho$, and $\mathrm{proj}_i(\rho)$ denotes the $i$-th
projection of $\rho$. For almost all relations $r$ of Table 1 we have developed a
narrowing algorithm that, given an $n$-tuple $(I_1, \ldots, I_n)$ of intervals, computes
exactly $\mathrm{hull}(r \cap I_1 \times \cdots \times I_n)$. Special attention has been
devoted to the distinct integers relation involved in the constraint
$x_1 \ne \ldots \ne x_n$. The narrowing algorithms are used to remove inconsistent values
from variable domains by computing
$\mathrm{hull}(r \cap \mathrm{dom}(x_1) \times \cdots \times \mathrm{dom}(x_n))$ for each
constraint $r(x_1, \ldots, x_n)$.


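Table 1: 16 primitive constraints

| Notation for $r(x_1, \dots, x_n)$ | Relation $r$, viewed as a set of $n$-tuples of integers |
|---|---|
| $x_1 \le x_2$ | $\{(p_1,p_2) \mid p_1 \text{ is less than or equal to } p_2\}$ |
| $x_3 = x_1 + x_2$ | $\{(p_1,p_2,p_3) \mid p_3 \text{ is the sum of } p_1 \text{ and } p_2\}$ |
| $x_1 \in D$ | $\{p_1 \mid p_1 \text{ is an element of } D\}$ |
| $x_2 = \neg x_1$ | $\{(p_1,p_2) \mid (p_2 = 0 \wedge p_1 = 1) \vee (p_2 = 1 \wedge p_1 = 0)\}$ |
| $x_1 \Rightarrow x_2$ | $\{(p_1,p_2) \mid p_1 = 0 \vee (p_1 = 1 \wedge p_2 = 1)\}$ |
| $x_2 = (x_1 \in D)$ | $\{(p_1,p_2) \mid (p_2 = 0 \wedge p_1 \notin D) \vee (p_2 = 1 \wedge p_1 \in D)\}$ |
| $x_3 = (x_1 = x_2)$ | $\{(p_1,p_2,p_3) \mid (p_3 = 0 \wedge p_1 \ne p_2) \vee (p_3 = 1 \wedge p_1 = p_2)\}$ |
| $x_3 = (x_1 \le x_2)$ | $\{(p_1,p_2,p_3) \mid (p_3 = 0 \wedge p_1 > p_2) \vee (p_3 = 1 \wedge p_1 \le p_2)\}$ |
| $x_3 = (x_1 < x_2)$ | $\{(p_1,p_2,p_3) \mid (p_3 = 0 \wedge p_1 \ge p_2) \vee (p_3 = 1 \wedge p_1 < p_2)\}$ |
| $x_4 = \text{if } x_1 \text{ then } x_2 \text{ else } x_3$ | $\{(p_1,p_2,p_3,p_4) \mid (p_1 = 0 \wedge p_4 = p_3) \vee (p_1 = 1 \wedge p_4 = p_2)\}$ |
| $x_n = \min_{i=1}^{n-1} x_i$ | $\{(p_1,\dots,p_n) \mid p_n \text{ is the minimum of } p_1,\dots,p_{n-1}\}$ |
| $x_n = \max_{i=1}^{n-1} x_i$ | $\{(p_1,\dots,p_n) \mid p_n \text{ is the maximum of } p_1,\dots,p_{n-1}\}$ |
| $x_n = \sum_{i=1}^{n-1} x_i$ | $\{(p_1,\dots,p_n) \mid p_n \text{ is the summation of } p_1,\dots,p_{n-1}\}$ |
| $x_1 \ne \dots \ne x_n$ | $\{(p_1,\dots,p_n) \mid (i \ne j) \Rightarrow (p_i \ne p_j), \text{ for all } i, j\}$ |
| $x_n = \mathrm{sort}_k(x_1,\dots,x_{n-1})$ | $\{(p_1,\dots,p_n) \mid p_n = p'_k \text{ where } (p'_1,\dots,p'_{n-1}) \text{ is the ascending sorting of } (p_1,\dots,p_{n-1})\}$ |
| $x_n \in \mathrm{sigma}_k(x_1,\dots,x_{n-1})$ | $\{(p_1,\dots,p_n) \mid \text{there exist } k \text{ distinct indices } i_1,\dots,i_k \text{ from } 1 \text{ to } n-1 \text{ such that } p_n = p_{i_1} + \dots + p_{i_k}\}$ |

Here $D$ denotes a finite subset of $\mathbb{Z}$ and $k$ a positive integer.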
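As an illustration of what a narrowing algorithm computes, the following minimal sketch handles the sum relation $x_3 = x_1 + x_2$ of Table 1 on closed integer intervals. The names `Interval`, `intersect` and `narrow_sum` are hypothetical and not taken from the paper; the paper does not give its narrowing algorithms, so this is only one plausible way to obtain the hull for this particular relation.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # closed integer interval [lo, hi]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersection of two closed intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def narrow_sum(i1: Interval, i2: Interval, i3: Interval):
    """Narrow (I1, I2, I3) for the relation p3 = p1 + p2.

    Bounds are tightened with interval arithmetic until nothing changes.
    Returns the narrowed triple, or None if the relation has no tuple
    inside the given box.
    """
    while True:
        n3 = intersect(i3, (i1[0] + i2[0], i1[1] + i2[1]))
        if n3 is None:
            return None
        n1 = intersect(i1, (n3[0] - i2[1], n3[1] - i2[0]))
        if n1 is None:
            return None
        n2 = intersect(i2, (n3[0] - n1[1], n3[1] - n1[0]))
        if n2 is None:
            return None
        if (n1, n2, n3) == (i1, i2, i3):
            return n1, n2, n3
        i1, i2, i3 = n1, n2, n3
```

For example, `narrow_sum((0, 10), (3, 5), (4, 6))` shrinks the first interval to `(0, 3)` and leaves the other two unchanged, while an unsatisfiable box such as `((0, 1), (0, 1), (5, 6))` yields `None`.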


2.3  The  propagation  of  constraints 

Consistency techniques are widely used in artificial intelligence for solving
constraint satisfaction problems. In the case of interval constraints, the propagation
technique as specified in [3], which can be traced back to the work of Mackworth
(arc-consistency [12]), is applied to reduce the domains of the variables. Given
a system of constraints and a first-in-first-out queue of constraints, initially
filled with all constraints of the system, the following propagation
algorithm is used.¹

    while ∃ r(x1, ..., xn) in queue do
        P := hull(dom(x1) × ... × dom(xn) ∩ r)
        if P = ∅ then
            stop (no solution for the system)
        else
            for i := 1 to n do
                if proj_i(P) ≠ dom(x_i) then
                    queue := queue ∪ {constraints over x_i}
                    dom(x_i) := proj_i(P)
                endif
            endfor
        endif
        queue := queue \ {r(x1, ..., xn)}
    endwhile
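The propagation loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the representation of a constraint as a pair `(variables, narrow)` and the helper `narrow_leq` (a narrowing operator for $x_1 \le x_2$) are assumptions made for the example. As in the pseudocode, the constraint being processed is not re-queued by its own changes, which is safe only when its narrowing operator already returns the exact hull.

```python
from collections import deque


def narrow_leq(boxes):
    """Narrowing operator for x <= y on closed integer intervals."""
    (xl, xh), (yl, yh) = boxes
    xh, yl = min(xh, yh), max(yl, xl)
    if xl > xh or yl > yh:
        return None  # empty hull: the constraint cannot be satisfied
    return [(xl, xh), (yl, yh)]


def propagate(domains, constraints):
    """FIFO constraint propagation.

    domains: dict mapping a variable name to an interval (lo, hi);
             it is narrowed in place.
    constraints: list of (variables, narrow) pairs, where narrow maps a
             list of intervals to the narrowed list, or to None.
    Returns False as soon as some hull is empty, True otherwise.
    """
    queue = deque(range(len(constraints)))
    queued = set(queue)
    while queue:
        c = queue.popleft()
        queued.discard(c)
        variables, narrow = constraints[c]
        boxes = narrow([domains[v] for v in variables])
        if boxes is None:
            return False
        for v, new in zip(variables, boxes):
            if new != domains[v]:
                domains[v] = new
                # Wake up the other constraints that mention v.
                for k, (other_vars, _) in enumerate(constraints):
                    if k != c and v in other_vars and k not in queued:
                        queue.append(k)
                        queued.add(k)
    return True
```

With domains `a = (3, 9)`, `b = (0, 5)`, `c = (0, 4)` and the constraints `a <= b` and `b <= c`, the loop narrows all three domains to `(3, 4)`.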

2.4  The  nondeterministic  search 

The constraint solving paradigm we adopt is a hybrid of constraint propagation
and a simple search scheme: each time the propagation of constraints terminates,
if no solution (or partial solution) has been reached, a simple enumeration scheme
is used to search for solutions. The idea is to split the domain of some variable
into two sub-intervals and to deal with the subproblems corresponding to these two
parts in turn. As the constraint propagation (narrowing of intervals) and the
branching (splitting of intervals) go on, the system will either reach some solution
or prove that the subproblem has no solution (due to the finiteness of variable
domains). In the latter case, the system backtracks to search other branches
for solutions.
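The interplay of propagation and interval splitting can be sketched as a small recursive procedure. Everything here is illustrative: `search` takes any propagation function as a parameter, and `toy_propagate` is a deliberately weak, hypothetical stand-in that only rejects a box once both variables are fixed (it encodes the toy problem $x + y = 7$, $x > y$, which is not from the paper).

```python
def search(domains, propagate):
    """Depth-first search by domain splitting.

    domains: dict mapping a variable to a closed interval (lo, hi).
    propagate: function returning the narrowed domains, or None on failure.
    Returns one complete assignment as a dict, or None if there is none.
    """
    domains = propagate(dict(domains))
    if domains is None:
        return None  # this subproblem has no solution: backtrack
    for var, (lo, hi) in domains.items():
        if lo < hi:
            mid = (lo + hi) // 2
            for half in ((lo, mid), (mid + 1, hi)):
                child = dict(domains)
                child[var] = half
                result = search(child, propagate)
                if result is not None:
                    return result
            return None  # both halves failed
    # Every domain is a single value: a solution has been reached.
    return {v: lo for v, (lo, _) in domains.items()}


def toy_propagate(domains):
    """Hypothetical weak propagation for x + y = 7 and x > y."""
    (xl, xh), (yl, yh) = domains["x"], domains["y"]
    if xl == xh and yl == yh and not (xl + yl == 7 and xl > yl):
        return None
    return domains
```

Calling `search({"x": (1, 6), "y": (1, 6)}, toy_propagate)` explores the lower halves first and returns the first solution met in that order. A stronger propagation function would cut branches much earlier without changing the search code.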

2.5  The  optimization 

For optimization, a depth-first branch-and-bound strategy is used: each time a
solution is found, a new upper bound (for minimization) or lower bound (for
maximization) that is strictly better than the cost of this solution is imposed
on the objective. From then on, only better solutions are searched for, and thus
the last solution found is optimal.
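A minimal sketch of this branch-and-bound scheme for minimization is given below. The function and parameter names (`branch_and_bound`, `feasible`, `cost`, `lower_bound`) are hypothetical, and the pruning test stands in for what constraint propagation would do in the real system once the bound "cost ≤ best − 1" has been imposed.

```python
def branch_and_bound(domains, feasible, cost, lower_bound):
    """Depth-first branch and bound by domain splitting (minimization).

    domains: dict variable -> (lo, hi)
    feasible(point): True if the complete assignment satisfies the constraints
    cost(point): integer objective value of a complete assignment
    lower_bound(box): a value not exceeding the cost of any point in the box
    Returns (best_solution, best_cost), or (None, None) if infeasible.
    """
    best = {"cost": None, "solution": None}

    def explore(box):
        # After a solution of cost c, only costs <= c - 1 are acceptable.
        if best["cost"] is not None and lower_bound(box) > best["cost"] - 1:
            return
        for var, (lo, hi) in box.items():
            if lo < hi:
                mid = (lo + hi) // 2
                for half in ((lo, mid), (mid + 1, hi)):
                    explore({**box, var: half})
                return
        point = {v: lo for v, (lo, _) in box.items()}
        if feasible(point):
            c = cost(point)
            if best["cost"] is None or c <= best["cost"] - 1:
                best["cost"], best["solution"] = c, point

    explore(dict(domains))
    return best["solution"], best["cost"]
```

For instance, minimizing $x + y$ subject to $x \cdot y \ge 6$ over $[1,6]^2$ with the sum of the lower interval ends as `lower_bound` first finds a solution of cost 7, then one of cost 5, and prunes every remaining box whose bound exceeds 4.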


¹ As pointed out in [3], the algorithm trivially terminates.

3  The sorting constraint

In conventional procedural programming, sorting is an important issue. So it is
in constraint programming: as will be seen in the resolution of the job-shop
problem, the sorting constraint, which requires that $n$ integers $y_1, \dots, y_n$
be the sorting of $x_1, \dots, x_n$ in ascending order, plays a central role.

Based on the same principle, the authors of [14] discussed this constraint in
general terms (they call it the sortedness constraint) together with its potential
applications in scheduling problems. In a different spirit, we deal with this issue
by giving a set of primitive constraints for it, and we use this constraint as the
basis for formulating the resolution of the job-shop problem.

Consider a permutation constraint $\mathrm{permutation}(o_1, \dots, o_n)$ which
expresses the fact that the $n$-tuple $(o_1, \dots, o_n)$ is a permutation of the
$n$-tuple $(1, \dots, n)$. Based on the distinct integers constraint, which requires
that $n$ integers $o_1, \dots, o_n$ be distinct, the permutation constraint can be
stated equivalently as:

$$o_i \in [1, n], \quad \text{for } 1 \le i \le n$$

$$o_1 \ne \dots \ne o_n$$

In the light of the permutation constraint, sorting can be brought into the scope
of constraint programming: $(y_1, \dots, y_n)$ is the ascending sorting of
$(x_1, \dots, x_n)$ if and only if there exists a permutation $(o_1, \dots, o_n)$ of
$\{1, \dots, n\}$ such that $y_1 \le \dots \le y_n$ and $x_i = y_{o_i}$ for
$1 \le i \le n$. So, with the permutation variables $o_1, \dots, o_n$ made explicit,
the sorting relation is defined as:

$$\mathrm{sorting} = \left\{ (x_1, \dots, x_n,\, y_1, \dots, y_n,\, o_1, \dots, o_n) \;:\;
\begin{array}{l}
(o_1, \dots, o_n) \text{ is a permutation of } (1, \dots, n), \\
y_1 \le \dots \le y_n, \\
x_1 = y_{o_1}, \dots, x_n = y_{o_n}
\end{array} \right\}$$

The goal is then to solve the constraint
$\mathrm{sorting}(x_1, \dots, x_n,\, y_1, \dots, y_n,\, o_1, \dots, o_n)$.
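The definition of the sorting relation translates directly into a membership test on ground values. The function name `is_sorting` is hypothetical; the sketch only checks whether a given triple of tuples belongs to the relation and performs no narrowing.

```python
def is_sorting(x, y, o):
    """True iff (x, y, o) belongs to the sorting relation.

    x, y: sequences of integers; o: sequence of 1-based positions.
    The three conditions mirror the definition: o is a permutation of
    (1, ..., n), y is ascending, and x[i] equals y at position o[i].
    """
    n = len(x)
    return (
        len(y) == n == len(o)
        and sorted(o) == list(range(1, n + 1))
        and all(y[k] <= y[k + 1] for k in range(n - 1))
        and all(x[i] == y[o[i] - 1] for i in range(n))
    )
```

For example, `is_sorting([4, 1, 3, 2], [1, 2, 3, 4], [4, 1, 3, 2])` holds, whereas changing the permutation to `[4, 1, 2, 3]` violates $x_i = y_{o_i}$.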

Inspired by the definition of the sorting relation, a basic constraint program for
it is as follows: for $1 \le i, j \le n$,

$$\mathrm{permutation}(o_1, \dots, o_n)$$

$$y_1 \le y_2, \; \dots, \; y_{n-1} \le y_n$$

$$(o_i \le j) \Rightarrow (x_i \le y_j)$$

$$(x_i < y_j) \Rightarrow (o_i < j)$$

where the third and the fourth constraints are a more effective version of the
$x_i = y_{o_i}$ constraint (usually called the constraint of $i$-th element), making
use of the property that $y_1, \dots, y_n$ are in ascending order. They are
respectively decomposed into:

$$a_{ij} = (o_i \le j), \quad b_{ij} = (x_i \le y_j), \quad a_{ij} \Rightarrow b_{ij}$$

$$a'_{ij} = (x_i < y_j), \quad b'_{ij} = (o_i < j), \quad a'_{ij} \Rightarrow b'_{ij}$$

where $a_{ij}, a'_{ij}, b_{ij}, b'_{ij}$ are intermediate boolean variables taking
the values 0 (false) and 1 (true).

We use the following example to illustrate the effect of our constraint program
for sorting:

$$\begin{pmatrix}
[3,4] & [1,2] & [2,3] & [2,3] \\
[4,4] & [1,2] & [3,4] & [2,4] \\
[1,1] & [1,2] & [2,3] & [3,4]
\end{pmatrix}$$

can be maximally narrowed to:

$$\begin{pmatrix}
[4,4] & [1,1] & [3,3] & [2,2] \\
[4,4] & [1,1] & [3,3] & [2,2] \\
[1,1] & [2,2] & [3,3] & [4,4]
\end{pmatrix}$$

The basic program is sound and complete for the sorting constraint, but its
procedural performance is not quite satisfactory. For instance, due to the locality
of constraint propagation, the basic set of constraints cannot reduce any interval
in the following case:

$$\begin{pmatrix}
[3,4] & [1,2] & [2,4] & [1,3] \\
[3,4] & [1,2] & [3,3] & [2,2] \\
[1,2] & [1,3] & [2,4] & [3,4]
\end{pmatrix}$$

A trivial redundant constraint for the sorting constraint is: the number of $x_j$'s
that are less than or equal to $y_i$ must be at least $i$, and the number of $x_j$'s
that are greater than or equal to $y_i$ must be at least $n - i + 1$. At the
meta-level, it can be expressed as follows:

$$d_i \le y_i \le d'_i \quad \text{where} \quad
\begin{cases}
(d_1, \dots, d_n) \text{ is the ascending sorting of } (\underline{x}_1, \dots, \underline{x}_n), \\
(d'_1, \dots, d'_n) \text{ is the ascending sorting of } (\overline{x}_1, \dots, \overline{x}_n),
\end{cases}$$

with $\underline{x}_j$ and $\overline{x}_j$ denoting the lower and upper bounds of
the domain of $x_j$.

To remain in the framework of constraint solving, we encapsulate the above
meta-level constraint into the following constraint:

$$y_i = \mathrm{sort}_i(x_1, \dots, x_n)$$

understanding that $y_i$ ranks $i$-th in the ascending sorting of $x_1, \dots, x_n$.
See Table 1 for the definition of the $\mathrm{sort}_k$ constraint.
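The meta-level bound behind the $\mathrm{sort}_i$ constraint can be sketched as follows, under the assumption that it reads "the $i$-th smallest lower bound of the $x$'s is at most $y_i$, which is at most the $i$-th smallest upper bound of the $x$'s". That reading is an interpretation of a formula that is damaged in the source; the function name `narrow_sorted` is hypothetical.

```python
def narrow_sorted(x_doms, y_doms):
    """Narrow the domains of y_1..y_n given those of x_1..x_n.

    x_doms, y_doms: lists of closed intervals (lo, hi) of equal length.
    y_i is intersected with [d_i, d'_i], where d (resp. d') is the
    ascending sorting of the lower (resp. upper) bounds of the x's.
    Returns the narrowed y domains, or None if one becomes empty.
    """
    lows = sorted(lo for lo, _ in x_doms)
    highs = sorted(hi for _, hi in x_doms)
    narrowed = []
    for (lo, hi), d, d_prime in zip(y_doms, lows, highs):
        new = (max(lo, d), min(hi, d_prime))
        if new[0] > new[1]:
            return None
        narrowed.append(new)
    return narrowed
```

Applied to the $x$ and $y$ rows of the example that the basic program could not reduce, `narrow_sorted([(3, 4), (1, 2), (3, 3), (2, 2)], [(1, 2), (1, 3), (2, 4), (3, 4)])` gives `[(1, 2), (2, 2), (3, 3), (3, 4)]`, which agrees with the $y$ row of the narrowed matrix reported in the text.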

The $\mathrm{sort}_k$ constraint turns out to be quite useful for narrowing the
intervals. The basic program for sorting plus the $\mathrm{sort}_k$ constraints can
maximally narrow the intervals of the above example to:

$$\begin{pmatrix}
[3,4] & [1,2] & [3,4] & [1,2] \\
[3,4] & [1,2] & [3,3] & [2,2] \\
[1,2] & [2,2] & [3,3] & [3,4]
\end{pmatrix}$$


(In the matrices above, the first row gives the domains of $o_1, \dots, o_4$, the
second row those of $x_1, \dots, x_4$, and the third row those of $y_1, \dots, y_4$.)

4  Solving the job-shop problem

Now we come to the resolution of the job-shop problem. We introduce the following
variables for the unknowns: for $1 \le i \le m$, $1 \le j \le n$,

•  $o_{ij}$ : the order of job $j$ processed on machine $i$.

•  $x_{ij}$ : the release date of job $j$ on machine $i$.

•  $y_{ij}$ : the due date of job $j$ on machine $i$.

•  $u_{ij}$ : the release date of the job scheduled $j$-th on machine $i$.

•  $v_{ij}$ : the due date of the job scheduled $j$-th on machine $i$.

•  $w$ : the maximum of the completion times of all jobs.

Then a basic constraint program can be given as follows: for $1 \le i \le m$,
$1 \le j \le n$,

•  $y_{\pi_{ij},\,j} \le x_{ij}$   (precedence constraint)

•  $y_{ij} = x_{ij} + d_{ij}$   (duration constraint)

•  $\mathrm{sorting}(x_{i1}, \dots, x_{in},\, u_{i1}, \dots, u_{in},\, o_{i1}, \dots, o_{in})$   (disjunctive constraint)

   $\mathrm{sorting}(y_{i1}, \dots, y_{in},\, v_{i1}, \dots, v_{in},\, o_{i1}, \dots, o_{in})$

   $v_{ij} \le u_{i,j+1}$

•  $v_{in} \le w$   (for minimization)

The sorting constraints require that the release date $x_{ij}$ (due date $y_{ij}$)
be ranked $o_{ij}$-th among the scheduled release dates $u_{i1}, \dots, u_{in}$
(scheduled due dates $v_{i1}, \dots, v_{in}$). Then, by imposing the precedence
constraints $v_{ij} \le u_{i,j+1}$ over the sorted dates so as to make the scheduled
tasks disjoint, the disjunctive constraints of the job-shop problem are enforced.
The feature of such an approach is that it takes into account the ordering of tasks
on a machine globally rather than pairwise.
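On a single machine with fixed dates, the sorted-date view can be illustrated by the following sketch. The function name `machine_view` is hypothetical, release dates are assumed distinct, and the code only checks a given schedule; it does not perform any constraint solving.

```python
def machine_view(release, duration):
    """Sorted-date view of the tasks of one machine.

    release[j], duration[j]: release date and duration of job j.
    Returns (order, u, v, disjoint) where
      order[j] is the 1-based rank of job j by release date (the o values),
      u and v are the ascending release and due dates,
      disjoint tells whether v[k] <= u[k + 1] holds for all consecutive ranks.
    """
    due = [r + d for r, d in zip(release, duration)]
    u, v = sorted(release), sorted(due)
    by_start = sorted(range(len(release)), key=lambda j: release[j])
    order = [0] * len(release)
    for rank, j in enumerate(by_start, start=1):
        order[j] = rank
    disjoint = all(v[k] <= u[k + 1] for k in range(len(u) - 1))
    return order, u, v, disjoint
```

With release dates `[5, 0, 2]` and durations `[3, 2, 3]`, the job ranks are `[3, 1, 2]`, the sorted dates are `u = [0, 2, 5]` and `v = [2, 5, 8]`, and the tasks are disjoint. With release dates `[0, 1]` and durations `[3, 1]` the second task lies inside the first, and the test on consecutive ranks fails.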

The goal is to instantiate all of the ordering variables $o_{ij}$, which make up a
partial solution of the job-shop problem. In the following subsections, we discuss
the branch-and-bound method for instantiating the $o_{ij}$'s and give redundant
constraints that improve the efficiency of the constraint program.

4.1  The  search  strategy 

Optimization is carried out by depth-first branch-and-bound search: (1) each time
the propagation of constraints terminates, if not all ordering variables are
instantiated, the system splits the domain of some uninstantiated ordering variable
and enumerates the resulting parts, each of which constitutes a subproblem; (2) each
time a solution is found, a new upper bound ($w - 1$) on the completion times of all
jobs is imposed.

The search strategy is based on the first-fail principle. Firstly, we select the
critical machine $i$ for which the total slack of the sorted dates, a sum over
$j = 1, \dots, n$, is minimum. The machine on which the tasks are the most
constrained is thus selected. This is justified by the aim of forcing the bottleneck
of failures (backtracks) as early as possible, so that a smaller space is searched.

Empirical results show that keeping the first critical machine fixed until all jobs
on it are completely scheduled yields better results. So once the first critical
machine is selected, it remains fixed until the schedule on it is done. On the
critical machine $i$, the critical job $j$ is selected in accordance with the
following criteria, applied in this order:

1.  the total slack of the release and due dates of the tasks scheduled in the
    interval $\mathrm{dom}(o_{ij})$ is minimum.

2.  $\sum_{k=1}^{m} |o_{kj}|$ is maximum.

3.  $|\alpha_1 - \alpha_2|$ is maximum.

where $\alpha_1$ and $\alpha_2$ are the biases to the date averages of the jobs
scheduled in $I_1$ and $I_2$ respectively, and $I_1$ and $I_2$ are the lower half
and the upper half of the domain of $o_{ij}$.

The goal is to select the job for which the total slack of the release and due dates
of the tasks scheduled in the interval $\mathrm{dom}(o_{ij})$ is minimum (like the
criterion for selecting the critical machine, this criterion aims at forcing the
bottleneck as early as possible). In case of a tie, we select the job for which the
total slack of its ordering variables on all machines is maximum. To break further
ties if necessary, we select the job with the maximal difference between the biases
to the date averages of the jobs scheduled in the lower half $I_1$ and the upper
half $I_2$ of $\mathrm{dom}(o_{ij})$. The latter two criteria aim at producing the
greatest change to the constraint system when the interval is split.

For branching, we split the domain of the ordering variable in two and enumerate
the two branches with the following heuristic:

    if α1 < α2 then deal with the lower half I1 first
    else deal with the upper half I2 first

The goal of this heuristic is to reach solutions as rapidly as possible.
4.2  The redundant constraints

The  basic  constraint  program  gives  a  clear  idea  for  solving  the  job-shop  problem. 
However,  the  hardness  of  the  problem  requires  us  to  add  into  the  constraint  system 
redundant  constraints  to  prune  the  search  tree  as  much  and  as  early  as  possible. 
The  idea  is  to  tighten  the  constraints  over  orders  of  the  tasks,  release  and  due  dates, 
and  the  scheduled  dates. 

For task orders, consider the constraints between $o_{ij}$ and $o_{ik}$ for
$1 \le i \le m$ and $1 \le j < k \le n$. Based on the fact that all tasks on a
machine must be disjoint, the following equivalences hold:

$$(o_{ij} < o_{ik}) = \neg(o_{ik} < o_{ij}) = (x_{ij} < y_{ik}) = (y_{ij} \le x_{ik}) \tag{1}$$

The essence of these constraints is to exploit the fact that, because the tasks are
disjoint, the release date $x_{ij}$ being less than the due date $y_{ik}$ is
equivalent to task $(i,j)$ preceding task $(i,k)$. This is very useful in pruning
the search tree.

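Moreover, the order of job $j$ on machine $i$ must be equal to the sum of (difference
between) the minimal order (maximal order) and the number of tasks preceding
(succeeding) it, which is trivial:

$$o_{ij} = \min_{1 \le k \le n} (\text{if } o_{ik} \le o_{ij} \text{ then } o_{ik} \text{ else } +\infty) + \sum_{1 \le k \le n,\, k \ne j} (o_{ik} < o_{ij}) \tag{2}$$

$$o_{ij} = \max_{1 \le k \le n} (\text{if } o_{ij} \le o_{ik} \text{ then } o_{ik} \text{ else } -\infty) - \sum_{1 \le k \le n,\, k \ne j} (o_{ij} < o_{ik}) \tag{3}$$

Similarly, for release and due dates, consider the constraints between $x_{ij}$ and
$x_{ik}$ ($y_{ij}$ and $y_{ik}$) for $1 \le i \le m$ and $1 \le j \le n$. The release
date (due date) of job $j$ on machine $i$ must be greater than (less than) or equal
to the sum of (difference between) the minimal release date (maximal due date) and
the total duration of the tasks preceding (succeeding) job $j$:

$$x_{ij} \ge \min_{1 \le k \le n} (\text{if } o_{ik} \le o_{ij} \text{ then } x_{ik} \text{ else } +\infty) + \sum_{1 \le k \le n,\, k \ne j} (\text{if } o_{ik} < o_{ij} \text{ then } d_{ik} \text{ else } 0) \tag{4}$$

$$y_{ij} \le \max_{1 \le k \le n} (\text{if } o_{ij} \le o_{ik} \text{ then } y_{ik} \text{ else } -\infty) - \sum_{1 \le k \le n,\, k \ne j} (\text{if } o_{ij} < o_{ik} \text{ then } d_{ik} \text{ else } 0) \tag{5}$$

For constraints (4) and (5), meta-level constraints can be given (and similarly for
constraints (2) and (3)):

$$x_{ij} \ge \min_{1 \le k \le n \,\wedge\, (o_{ik} \le o_{ij}) = 1} x_{ik} \; + \sum_{1 \le k \le n \,\wedge\, k \ne j \,\wedge\, (o_{ik} < o_{ij}) = 1} d_{ik}$$

$$y_{ij} \le \max_{1 \le k \le n \,\wedge\, (o_{ij} \le o_{ik}) = 1} y_{ik} \; - \sum_{1 \le k \le n \,\wedge\, k \ne j \,\wedge\, (o_{ij} < o_{ik}) = 1} d_{ik}$$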
The basic idea of constraints (1), (4) and (5) can be traced back to the work of
Carlier and Pinson and that of Applegate and Cook, while constraints (2) and (3)
are extensions of (4) and (5) to ordering variables. These types of constraints are
widely used in solving the job-shop problem, so they can be viewed as classical.
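To make constraint (4) concrete, the sketch below evaluates its right-hand side on a fully instantiated single-machine schedule. It assumes the reading in which the minimum ranges over the tasks ordered at or before job $j$ and the sum over the tasks ordered strictly before it; the function name `release_lower_bound` is hypothetical.

```python
def release_lower_bound(j, order, release, duration):
    """Right-hand side of constraint (4) for job j on one machine.

    order[k]: rank of job k on the machine (1-based, all distinct)
    release[k], duration[k]: release date and duration of job k
    Returns the earliest release date among the jobs ranked at or before
    job j, plus the total duration of the jobs ranked strictly before it.
    """
    n = len(order)
    earliest = min(release[k] for k in range(n) if order[k] <= order[j])
    work = sum(duration[k] for k in range(n) if k != j and order[k] < order[j])
    return earliest + work
```

For the schedule with ranks `[3, 1, 2]`, release dates `[5, 0, 2]` and durations `[3, 2, 3]`, the bounds for the three jobs are 5, 0 and 2, so each release date meets its bound with equality. In the solver the same expression is evaluated on interval domains rather than on fixed values.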


Peculiar to our approach, powerful redundant constraints can be used over sorted
dates. Let us consider the gaps between $v_{ik}$ and $u_{ij}$ for $1 \le i \le m$
and $1 \le j \le k \le n$. Firstly, the gap between due and release dates for each
sorted task must be equal to the duration of some task (this is trivial):

$$v_{ij} - u_{ij} \in \{d_{i1}, \dots, d_{in}\} \tag{6}$$

Furthermore, they must satisfy the following constraints:

$$v_{ij} - u_{ij} \ge \min_{1 \le k \le n} (\text{if } o_{ik} = j \text{ then } d_{ik} \text{ else } +\infty) \tag{7}$$

$$v_{ij} - u_{ij} \le \max_{1 \le k \le n} (\text{if } o_{ik} = j \text{ then } d_{ik} \text{ else } -\infty) \tag{8}$$

Finally, we wish to constrain the gaps with this idea: at any moment during the
constraint propagation, for the interval $[j, k]$ of orders, we can evaluate a
lower bound for the gap between the jobs scheduled in this interval by

•  adding up the durations of the jobs whose execution orders fall in $[j, k]$.

•  adding up the $k - j + 1$ minimal durations of the jobs whose execution orders
   do not surely fall outside $[j, k]$.

The constraints specified by us are as follows:

$$v_{ik} - u_{ij} \ge \sum_{1 \le l \le n} (\text{if } o_{il} \in [j, k] \text{ then } d_{il} \text{ else } 0) \tag{9}$$

$$v_{ik} - u_{ij} \ge \sum_{1 \le p \le k-j+1} (\text{if } o_{i l_p} \in [j, k] \text{ then } d_{i l_p} \text{ else } +\infty) \tag{10}$$

for some distinct indices $l_1, \dots, l_{k-j+1}$,


where the constraint of partial sum $x_0 \in \mathrm{sigma}_p(x_1, \dots, x_n)$
states that $x_0$ is the summation of some $p$ elements of $x_1, \dots, x_n$. For
this constraint, a meta-level necessary constraint (thus an incomplete algorithm for
the narrowing operator) is:

$$d_1 + \dots + d_p \le x_0 \le d'_{n-p+1} + \dots + d'_n \quad \text{where} \quad
\begin{cases}
(d_1, \dots, d_n) \text{ is the ascending sorting of } (\underline{x}_1, \dots, \underline{x}_n), \\
(d'_1, \dots, d'_n) \text{ is the ascending sorting of } (\overline{x}_1, \dots, \overline{x}_n).
\end{cases}$$
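The necessary condition for the partial-sum constraint can be sketched in a few lines: the sum of any $p$ elements lies between the sum of the $p$ smallest lower bounds and the sum of the $p$ largest upper bounds. The function name `narrow_partial_sum` is hypothetical, and only the domain of the sum variable is narrowed, which matches the incomplete character of the rule.

```python
def narrow_partial_sum(x_doms, p, sum_dom):
    """Narrow the domain of x0 for x0 in sigma_p(x_1, ..., x_n).

    x_doms: list of closed intervals (lo, hi) for x_1..x_n
    p: number of elements summed, with 1 <= p <= n
    sum_dom: current interval of x0
    Returns the narrowed interval of x0, or None if it becomes empty.
    """
    if not 1 <= p <= len(x_doms):
        raise ValueError("p must be between 1 and the number of variables")
    lows = sorted(lo for lo, _ in x_doms)
    highs = sorted(hi for _, hi in x_doms)
    lo = max(sum_dom[0], sum(lows[:p]))     # p smallest lower bounds
    hi = min(sum_dom[1], sum(highs[-p:]))   # p largest upper bounds
    return (lo, hi) if lo <= hi else None
```

For the domains `[(2, 5), (1, 3), (4, 4)]` and `p = 2`, an initial interval `(0, 100)` for the sum is narrowed to `(3, 9)`.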

Constraints (9) and (10) are very useful in pruning the search tree. They are in fact
just a simplification of the following more efficient meta-level constraint:

$$v_{ik} - u_{ij} \ge \min_{S \subseteq \{d_{i1}, \dots, d_{in}\} \,\wedge\, |S| = k-j+1} \; \sum_{d \in S} d$$

subject to:

1.  if $\mathrm{dom}(o_{il}) \subseteq [j, k]$ then $d_{il} \in S$

2.  if $\mathrm{dom}(o_{il}) \cap [j, k] = \emptyset$ then $d_{il} \notin S$
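One way to evaluate this meta-level lower bound is a greedy selection: take every duration that is forced into $S$, discard every duration that is excluded, and complete $S$ with the smallest remaining durations. The sketch below follows that reading; the function name `gap_lower_bound` and the choice to return `None` when no admissible $S$ exists are assumptions made for the example.

```python
def gap_lower_bound(order_doms, durations, j, k):
    """Lower bound on v_ik - u_ij for the rank interval [j, k] of one machine.

    order_doms[l]: interval (lo, hi) of the ordering variable of job l
    durations[l]: duration of job l on this machine
    Returns the minimal total duration of k - j + 1 jobs that can occupy
    the ranks j..k, or None if no such set of jobs exists.
    """
    size = k - j + 1
    forced, optional = [], []
    for (lo, hi), d in zip(order_doms, durations):
        if j <= lo and hi <= k:
            forced.append(d)        # dom(o_il) is included in [j, k]
        elif hi < j or lo > k:
            continue                # dom(o_il) is disjoint from [j, k]
        else:
            optional.append(d)      # job l may or may not fall in [j, k]
    need = size - len(forced)
    if need < 0 or need > len(optional):
        return None
    return sum(forced) + sum(sorted(optional)[:need])
```

With ordering domains `[(1, 1), (2, 3), (2, 4), (3, 4)]`, durations `[7, 5, 2, 4]` and the rank interval `[2, 3]`, the second job is forced into the interval, the first is excluded, and the cheaper of the two remaining jobs completes the set, giving a bound of 7.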

Remark: All complex constraints are decomposed into primitive constraints in our
implementation, in the spirit of emulating a pure constraint language which supports
the constraint solving paradigm described in Section 2. Besides the basic primitive
constraints, the redundant constraints reduce to the following 5 specific primitive
constraints (see Table 1 for their definitions):

$x_0 = \text{if } x_1 \text{ then } x_2 \text{ else } x_3$   (conditional constraint)

$x_0 = \min_{i=1}^{n} x_i$   (constraint of minimum)

$x_0 = \max_{i=1}^{n} x_i$   (constraint of maximum)

$x_0 = \sum_{i=1}^{n} x_i$   (constraint of summation)

$x_0 \in \mathrm{sigma}_k(x_1, \dots, x_n)$   (constraint of partial sum)

For example, constraint (10) is decomposed into:

$$z1_{ijk} = v_{ik} - u_{ij}$$

$$z2_{ijkl} = (o_{il} \in [j, k])$$

$$z3_{ijkl} = \text{if } z2_{ijkl} \text{ then } d_{il} \text{ else } +\infty$$

$$z4_{ijk} \in \mathrm{sigma}_{k-j+1}(z3_{ijk1}, \dots, z3_{ijkn})$$

$$z1_{ijk} \ge z4_{ijk}$$

4.3  The  empirical  results 

A prototype of the interval constraint system, which involves all the primitive
constraints of Table 1, has been developed by us to support the constraint program
for the job-shop problem. The results of our constraint program on the 10 × 10
instances of [1] are given in Table 2. They were obtained on a Sun Sparcstation 10.
The fifth column gives the number of backtracks needed to reach the first solution,
while the seventh column gives the number of choice points of the search tree. As
far as the number of choice points examined by the constraint system is concerned,
the experimental result of the approach is satisfactory compared to specific
operations research algorithms (e.g., [1, 5]). An interesting result is that the
ratio of total nodes to running time is nearly constant. This is because of the
propagation of primitive constraints. The maximal depths of the search trees are
small, mainly due to the effective control of the distinct integers constraints.
All this permits us to better predict the size of the search space and the execution
time. Regarding the results with different initial upper bounds, the convergence of
the constraint program on the instance mt10 is measured in Table 3. The convergence
speed is satisfactory (compared to [1]). Experiments on other 10 × 10 instances give
similar results.

For the 10 × 10 instances and for our constraint system, the current constraint
program produces about 36,000 variables (basic and intermediate) and 40,000
primitive constraints. In general, for a problem with $m$ machines and $n$ jobs, the
number of variables and the number of constraints are both polynomial ($O(mn^3)$).
In fact, we sacrificed much of the efficiency of space pruning as well as speed by
restricting our algorithm to the framework of propagating primitive constraints and
the naive enumeration scheme. In practice, a solution would be to apply meta-level
constraints so as to strengthen the pruning of the search tree. For this, the last
column of Table 2 gives the number of choice points of the search tree when
constraints (2)-(5) and (9)-(10) are replaced by their corresponding meta-level
constraints. Since the redundant constraints produce most of the intermediate
variables and primitive constraints, the number of variables and constraints
decreases greatly if a meta-level implementation of the redundant constraints is
allowed. And in practice, some of the redundant constraints (e.g., constraints (2),
(3), (6), (7), (8)), which are relatively weak in pruning search trees, can be
removed at the user's option. On such considerations, at the meta-level our
redundant constraints are even easier to implement than those of some other
efficient approaches (e.g., [2, 7]).

Besides the 10 × 10 instances, Table 4 gives some results for several instances
of larger size. For these results, constraints (2)-(5) and (9)-(10) are all replaced
by the meta-level constraints and are implemented wholly as large constraints (not
decomposed). The main aim is to save memory and to increase the space pruning
power. Interestingly, the ratio of total nodes to running time is nearly constant
for instances of the same size, and we achieved good results: the program found the
optimal solution of the 10 machines × 15 jobs instance la24 and proved its
optimality by examining only 616,298 nodes, better than the result (16,115,842
nodes) given in [1]; for la40, which is considered hard in [1], our program only
needs to examine 55,571 nodes for the optimal solution and the proof of optimality
(to our knowledge, no detailed solution result was given for this instance in the
literature). Considering the global search controlled by the distinct integers
constraint and the good performance on harder instances, our constraint program is
expected to work better for job-shop problems of larger size. Indeed, an intuition
is that our constraint program pays a constraint processing overhead on "small"
instances like mt10 but pays off on larger instances. From this angle, the
performance of our constraint program compares well with specific operations
research algorithms (e.g., [1, 5]) and with other efficient approaches for which a
highly practical implementation was pursued (e.g., [2, 7]).
Table 2: solutions of ten 10 × 10 instances

| Instance | Initial upper bound | Opt upper bound | First solution found: Sol. bound | First solution found: Backtracks | First solution found: Time (s) | Opt proven: Total nodes | Opt proven: Time (s) | Max depth | Nodes (meta level) |
|---|---|---|---|---|---|---|---|---|---|
| mt10 | 930 | 930 | 930 | 3339 | 905 | 11591 | 3323 | 23 | 9209 |
| orb1 | 1070 | 1059 | 1068 | 4734 | 1183 | 73189 | 18463 | 34 | 55828 |
| orb2 | 890 | 888 | 890 | 201 | 52 | 38826 | 8126 | 47 | 27199 |
| orb3 | 1021 | 1005 | 1020 | 7840 | | 145304 | 36290 | 44 | 98099 |
| orb4 | 1019 | 1005 | 1019 | 915 | | 7346 | 1862 | 26 | 2736 |
| orb5 | 896 | 887 | 896 | 239 | 63 | 3883 | 900 | 31 | 3156 |
| abz5 | 1245 | 1234 | 1242 | 3404 | 860 | 8218 | 2138 | 38 | 6619 |
| abz6 | 943 | 943 | 943 | 22 | 9 | 4892 | 1227 | 28 | 560 |
| la19 | 848 | 842 | 846 | 1246 | 367 | 9130 | 2339 | 28 | 7915 |
| la20 | 911 | 902 | 910 | 1533 | 319 | 22519 | 5564 | 47 | 24163 |

In particular, the first critical machine is fixed to be the 5th for orb3.


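Table 3: convergence analysis on mt10

| Initial bound | First solution found: Sol. bound | First solution found: Backtracks | First solution found: Time (s) | Optimality proven: Nodes | Optimality proven: Time (s) | Max depth |
|---|---|---|---|---|---|---|
| +∞ | 1488 | 0 | 5 | 69665 | 12027 | 229 |
| 30000000 | 1488 | 0 | 5 | 69665 | 12004 | 229 |
| 10000 | 1488 | 0 | 5 | 69665 | 11985 | 229 |
| 1000 | 996 | 43 | 10 | 52050 | 11175 | 55 |
| 950 | 949 | 315 | | 46158 | 10328 | 42 |
| 940 | 939 | 10460 | 2726 | 22067 | 5996 | 37 |
| 929 | - | - | - | 10954 | 3220 | 22 |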

Table 4: Solutions of some instances of larger size

| Instance | Init. bound | Opt bound | Sol. found: Backtracks | Opt proven: Total nodes | | Nodes/s |
|---|---|---|---|---|---|---|
| la24 (10x15) | 935 | 935 | 355889 | 616298 | 49 | 4 |
| la36 (15x15) | 1268 | 1268 | 1694 | 3370 | 58 | 3 |
| la39 (15x15) | 1233 | 1233 | 2443 | 3324 | 53 | 3 |
| la40 (15x15) | 1222 | 1222 | 541 | 55571 | 47 | 3 |

5  Concluding  remarks 

The goal of this paper is to propose a new approach for solving the job-shop
problem which is clean, flexible and robust. The spirit (different from that of
usual approaches) is to simulate the resolution of a hard problem by a pure
constraint language. By clean we mean that the constraint program makes use of pure
constraint programming techniques only. In other words, one feature of our
constraint program is its independence from algorithmic detail. In fact, all the
primitive constraints of Table 1 except $\mathrm{sort}_k$ and $\mathrm{sigma}_k$
exist in the literature on constraint systems or are easy to implement. For the last
two constraints, introduced by us, we can only narrow the domains of part of the
variables. However, we isolated these two constraints for the sake of better stating
the problem. In the resolution of the job-shop problem, they turn out to be very
useful in pruning the search tree. These two constraints integrate the conventional
sorting algorithm into the resolution of the job-shop problem, and thus complete
the idea of sorting the release and due dates of jobs.

The flexibility of our approach is inherent in constraint programming methods,
and thanks to the use of the distinct integers constraint, more global control over
the task orders is obtained. The flexibility and the good performance exhibited in
solving hard instances are evidence that our constraint program is robust for
solving the job-shop problem.

The gains of formalizing the job-shop problem by sorting the release and due
dates of the jobs on the machines are: a pure constraint programming formulation
is obtained; the ordering of the tasks and the sorted dates can be constrained as
wished, which exhibits flexibility and will be helpful for solving realistic
scheduling problems; a simple search scheme (enumeration based on splitting of the
intervals) oriented by the distinct integers constraint makes the enumeration more
global (and thus clearly distinct from other classical search strategies, e.g.,
edge-finding).

For future work, on the one hand we will try to improve the inference power of our
prototype of the constraint system (e.g., the constraint propagation speed), and
on the other hand, to solve harder instances of the job-shop problem or realistic
scheduling problems. For the latter, more powerful constraints or a more efficient
search strategy may be needed.
PSAP* is an aircraft production planning decision support system whose
aim is to schedule aircraft production over the coming years (more than five
years) at Dassault factories. This long-term production planning consists in
scheduling the assembly lines where large sections of aircraft are manufactured.
A section is a major aircraft part (e.g. cockpit, wing, rear fuselage, final
assembly, ...). Up to 400 aircraft and 30 assembly lines are concerned.

The objective is to find schedules that respect the delivery dates for all aircraft
and are a satisfactory compromise between two production criteria. On the
one hand, the storage between the different assembly lines of the factory must
be minimized. On the other hand, the production uses highly skilled labour, and
a change of production rate involves a work overload. The aim is then to minimize
the storage time for a given maximum number of production rate changes.

Production planning and scheduling are complex operations involving a great
number of constraints, both numerical and symbolic. Unfortunately, the difficulty
of expressing these constraints prevents the use of current scheduling or planning
tools. A first implementation has been done in the CLP language CHIP⁵. The
constraints used in PSAP are over finite domains. Good results were found⁶ with
appropriate heuristics; nevertheless, considering the very large size of the search
tree, we have to prune it.

Finding an optimal solution means exploring the whole search tree. The
search tree size depends on the choice of $Y$ changes out of an $X$-element
set (i.e. $\binom{X}{Y}$ branches) and on the fact that, for each change, there are
$P$ possible values ($P^Y$ branches), where $P$ is the number of possible production
rate values on the same assembly line (the average value of $P$ after constraint
propagation is generally 20), $X$ is the number of aircraft where a production
rate change can occur, and $Y$ is the number of rate changes given by the end-user
(the average value of $Y$ is below 10).
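The size estimate can be made tangible with a short calculation. The sketch below assumes that the two factors simply multiply, which is how the description above reads; the function name `tree_size` and the sample values are illustrative and are not data from the paper.

```python
from math import comb


def tree_size(x: int, y: int, p: int) -> int:
    """Number of leaves when choosing y change points among x aircraft
    and one of p production rate values for each chosen change."""
    return comb(x, y) * p ** y
```

For instance, `tree_size(10, 3, 20)` is 120 × 8000 = 960,000 leaves, and the count grows quickly with $Y$ because of the $P^Y$ factor.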

The need for parallelism has also been felt, since the sequential
optimization could run for small-size data sets (e.g. 70 aircraft) but not for
actual large-size data sets (e.g. 250 aircraft). A parallel implementation has been
realized with the languages ElipSys and ECLiPSe (ECLiPSe is the successor
of ElipSys). At certain nodes of the search tree, instead of trying each possibility
in sequence by backtracking, we try them in parallel on different processors. The
parallelism deals with predicates, either built-in or defined by the user. In PSAP,
the parallel predicate is the "minimize" procedure, which uses the branch-and-bound
method in order to explore the search tree.

*** J. Bellone now works at CR2a-DI, 32 rue des Cosmonautes, 31400 Toulouse.
* This work was partially funded by ESPRIT project number 6708, APPLAUSE.
⁵ COSYTEC: CHIP user manual, v.4.0, COSY/ref/001, June 93.
⁶ See [J. Bellone, A. Chamard and C. Pradelles], An evolutive planning system for
aircraft production, PAP'92.

Numerous benchmarks were conducted. They concerned only the schedule of the
last assembly line, which is the most difficult to schedule. Two cases may be
identified: the case where the whole search tree is explored, and the case where
the search is halted as soon as the solution cost is lower than the minimum
bound. In the first case the solution found is the optimal one; in the second
it is a sub-optimal one.

To illustrate these cases, two benchmarks are presented; they were performed on
a DRS6000 machine with 4 processors. To the left, an optimal solution search
where 120 subsets of possible production rate changes are explored in parallel
(data set of 70 Mirage 2000). To the right, a sub-optimal solution search where
84 subsets are explored instead of the 364 possible subsets (data set of 250
Mirage 2000).


Optimal search (70 aircraft)               Sub-optimal search (250 aircraft)

number of workers    1     2     3     4   number of workers    1     2     3      4
time in sec.       534   280   198   160   time in min.       212    52    22     18
speed-up             -   1.9  2.69  3.32   speed-up             -  4.02   9.7  11.51

In this kind of benchmark for the optimal solution, the speed-up is quasi-linear.
However, another issue arises: how much time parallelism can actually save once
constraint propagation is taken into account. Indeed, when the optimal solution
lies in the left part of the tree (depth-first search), it is found very quickly
by both the parallel and the sequential method. The problem is then to know the
parallel grain size a priori. In the PSAP problem, this size depends on the
number of rate changes, on the data set size and on the first solution found. In
particular, if the first solution found in sequential execution is the optimal
one and constraint propagation is good, the parallel grain size may be too small
to obtain speed-ups while searching for the proof of optimality.

For the sub-optimal search, the speed-up may be super-linear. These results
confirm the idea that the position of the required solution is important for
obtaining good speed-ups. In the above example, the solution is in the right
part of the tree, so the more workers there are, the faster the solution is
found. Nevertheless, there may exist a threshold such that using n workers gives
the same result as using n - 1, whereas using n + 1 yields a significant gain of
time. This was confirmed by tests on a 12-processor SGI computer. These
thresholds show that the parallel gain is due, to a large extent, to the
positions of optimal or sub-optimal solutions in the search tree.
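
The distinction between quasi-linear and super-linear behaviour can be checked
mechanically from the reported run times. The sketch below recomputes the
speed-up as T(1)/T(n) and lists the worker counts whose speed-up exceeds n; the
function names are illustrative, and the ratios recomputed from the rounded times
differ slightly from the printed speed-ups.

```python
def speedups(times):
    """Return T(1)/T(n) for n = 1..len(times); times[0] is the one-worker time."""
    base = times[0]
    return [base / t for t in times]


def super_linear(times):
    """Worker counts n (starting at 1) whose speed-up is strictly greater than n."""
    return [n for n, s in enumerate(speedups(times), start=1) if s > n]


optimal_sec = [534, 280, 198, 160]    # 70-aircraft run, times in seconds
suboptimal_min = [212, 52, 22, 18]    # 250-aircraft run, times in minutes

print(super_linear(optimal_sec))      # no worker count is super-linear
print(super_linear(suboptimal_min))   # 2, 3 and 4 workers are super-linear
```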


^ ECRC: user manual ElipSys 0.7 - Dec. 93
* ECRC: user manual ECLiPSe 3.5 - Dec. 94
1 Introduction

The establishment of local consistency in a database can be shown to have a
positive effect in reducing the complexity of search. The lower complexity of
achieving directed arc-consistency (DAC), based on a total ordering of the
variables, relative to full arc-consistency (AC), makes it the more viable
technique in practical database systems. In many high-volume transaction-based
systems it is desirable to optimise for a number of fixed variable orderings in
advance, rather than rely upon dynamic, run-time techniques. Achieving DAC
across a set of orderings defines the new problem of achieving consistency with
respect to a partial ordering of the variables. This problem may be defined as
partial arc-consistency (PAC).

2  The  Application  Domain 

Computerised Reservation Systems (CRS) are used to control almost every aspect
of the travel business from bookings and the issuing of tickets, to inventory
management and price planning. This work is concerned with just one part of this
process, the initial availability search which takes place prior to booking
(Battle et al. 1995b).

Conventional  reservation  systems  typically  use  holiday  ‘packages’  stored  as 
low-level  ‘templates’  of  which  there  may  be  many  hundreds  of  thousands  of 
variations.  Because  there  is  so  little  shared  structure,  the  search  for  matching 
holidays  is  an  essentially  brute-force  process,  a  fact  that  makes  search  harder 
than  it  need  be.  The  business  aims  are  to  improve  the  flexibility  of  the  database 
using  relational  technology,  but  to  maintain  present  performance. 

At any one time, many hundreds of travel agent sessions may be in progress,
and the response time must be on the order of a few seconds. In such high-volume
transaction-based systems, the database typically receives only a narrow
range of queries. This high transaction rate favours off-line pre-processing
techniques in preference to dynamic, run-time optimisation. The run-time process
is therefore fairly minimal, with much of the query processing being performed
in advance (Battle 1995a), so the results can be embedded within a conventional
programming language.
3 Partial Arc Consistency

Arc-consistency is a desirable property but is expensive to achieve. A more
pragmatic approach to consistency is directed arc-consistency, which is able to
exploit a known variable ordering. The effect of DAC is to reduce the thrashing
associated with backtracking. Savings due to DAC in the availability search
averaged out at over 52 consistency checks (database accesses) per run.

The  definition  of  DAC  assumes  that  only  a  single  ordering  is  used  to  search 
the  graph.  Given  a  number  of  variable  orderings,  establishing  DAC  for  each 
ordering  in  turn  is  not  the  most  efficient  approach,  nor  is  DAC  guaranteed  if 
established  only  once  for  each  ordering  as  subsequent  runs  can  interfere  with  the 
consistency  of  earlier  orderings. 

PAC  can  be  established  using  an  arc-consistency  algorithm  (AC3),  initialised 
with  a  graph  derived  from  the  set  of  orderings  by  taking  the  union  of  the  ordered 
graphs  (The  directed  graph  for  a  sequence  is  formed  by  aligning  each  edge  along 
the  direction  of  the  ordering).  An  algorithm  such  as  AC3  will  then  establish 
DAC  for  every  ordering  corresponding  to  a  path  through  this  directed  graph 
that  visits  each  variable  exactly  once. 
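
As an illustration of how the input graph for PAC might be assembled, the sketch
below orients every constraint edge along each ordering and takes the union of
the results. The function name and data representation are assumptions made for
this example; the edge direction (from the earlier variable to the later one) is
one reading of "aligning each edge along the direction of the ordering".

```python
def union_graph(constraints, orderings):
    """Directed union of the ordered graphs.

    constraints: iterable of two-variable pairs (undirected constraint edges).
    orderings:   list of sequences, each a total ordering of all variables.
    Returns a set of directed edges (u, v): u precedes v in some ordering.
    """
    undirected = {frozenset(c) for c in constraints}
    edges = set()
    for order in orderings:
        position = {var: k for k, var in enumerate(order)}
        for c in undirected:
            # Orient the edge from the earlier variable to the later one.
            u, v = sorted(c, key=position.__getitem__)
            edges.add((u, v))
    return edges


graph = union_graph([("a", "b"), ("b", "c")],
                    [["a", "b", "c"], ["b", "a", "c"]])
print(sorted(graph))
```

In this small example the two orderings disagree on a and b, so the union graph
contains both (a, b) and (b, a), i.e. a cycle.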

This adaptation of AC3 can be improved by exploiting the existence of cycles
within this directed graph, allowing it to be partitioned into a number of
disjoint subgraphs. The dependencies between these partitions are noted.
Redundant consistency checks may be avoided by ensuring that constraints within
a partition are checked before those leading out of the partition. To avoid any
partition being checked twice, the partitions must be arranged in an order such
that no partition depends upon a later one. These partitions are processed in a
single directed pass, with AC being achieved within each partition.
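
One plausible way to obtain such partitions, assuming they correspond to the
strongly connected components of the union graph, is Tarjan's algorithm. The
sketch below is not the authors' implementation: the function name is
illustrative, and the choice of which end of the resulting order to process
first is left open.

```python
def strongly_connected_components(nodes, edges):
    """Tarjan's algorithm (recursive, suitable for small graphs).

    Components are emitted sinks-first: a component is listed only after
    every component reachable from it has been listed.
    """
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    index, low, on_stack = {}, {}, set()
    stack, components = [], []

    def visit(n):
        index[n] = low[n] = len(index)
        stack.append(n)
        on_stack.add(n)
        for m in succ[n]:
            if m not in index:
                visit(m)
                low[n] = min(low[n], low[m])
            elif m in on_stack:
                low[n] = min(low[n], index[m])
        if low[n] == index[n]:          # n is the root of a component
            comp = set()
            while True:
                m = stack.pop()
                on_stack.discard(m)
                comp.add(m)
                if m == n:
                    break
            components.append(comp)

    for n in nodes:
        if n not in index:
            visit(n)
    return components


parts = strongly_connected_components(
    ["a", "b", "c"], [("a", "b"), ("b", "a"), ("b", "c")])
print(parts)
```

Because the emitted order is a topological order of the component graph (read in
reverse), a single pass over it never needs to revisit a partition.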

Directed arc-consistency is achieved in a given total ordering if every
constraint, i → j, is checked after every revision of variable j. Where the
relationship, i → j, does not belong to a cycle, i and j are placed in separate
partitions, so that all revisions to j are made before the constraint, i → j,
is ever checked, ensuring DAC on that constraint. If the constraint, i → j, is
part of a cycle, PAC establishes AC which again implies DAC on that constraint.
PAC therefore guarantees DAC for every totally ordered graph that is a subset of
the input graph.

In the travel database, AC3 required 67 consistency checks to enforce PAC,
as opposed to only 58 for the algorithm outlined above. This 13% reduction
is within the maximum 30% saving predicted in tests on randomly generated
constraint graphs.
Abstract. HCLP extends CLP to include constraint hierarchies. We
present an algorithm based on our previous work and on the extended
notion of comparators for comparisons between the hierarchies that arise
from alternate rule choices in the program.

Keywords:  Constraints,  Constraint  hierarchy,  CLP,  HCLP 


1  Introduction,  Extended  Theory  and  Houria  Review 

A prototype implementation of HCLP(Real, Locally-Predicate-Better) is described
in [1]. Experience with writing programs in HCLP(R, LPB) has provided many
examples where the LPB comparator fails to rule out non-intuitive solutions.
This results from HCLP's inability to discriminate between solutions arising
from different rule choices. Inter-hierarchy comparison is therefore necessary.
To achieve it, the extension of the original theory in [2] consists of defining
the set of solutions to many constraint hierarchies. This lays the theoretical
foundation for inter-hierarchy comparators and makes it possible to eliminate
the undesirable solutions. We define
$S_0 = \{\theta_h : \forall c \in h_0,\ \mathit{Sat}(\theta_h, c)\}$ (i.e. the
set of valuations that satisfy the hard constraints in the hierarchies) and
$S = \{\theta_h : \theta_h \in S_0 \wedge \forall \sigma_{h'} \in S_0,\ \neg\mathit{better}(\sigma_{h'}, \theta_h, H)\}$.
Two types of comparators are distinguished: the Locally-Better and the
Globally-Better comparators. Only the Globally-Better comparators are extended
to compare valuations arising from different hierarchies, since they take some
aggregate measure to combine the errors obtained at each level of the hierarchy.
The original version of the Houria solver implements the global comparator
Unsatisfied-Count-Better [3]. Houria II and Houria III are extensions of Houria.
They handle different classes of labeled soft constraints where each class may
contain weighted constraints. The global criteria used are, respectively, the
Unsatisfied-Count-Best-Case-Better and the Weighted-Sum-Better comparators [4].

2  Algorithm  for  Inter-Hierarchy  Comparisons 

We describe our approach with the following steps:

- Form the set H of the constraint hierarchies resulting from the alternate rule
  choices in the HCLP program.

- For each hierarchy in H:

  • Compute the aggregate weight, depending on the criterion, of the satisfied
    constraints that contain only bound variables (denoted by w-bounded).

  • Compute the weight of the constraints that contain at least one free
    variable (denoted by w-free).

- Eliminate from H the hierarchies for which the sum of w-bounded and w-free
  is strictly less than the sum of another hierarchy in H.

- Order the resulting set H by decreasing sum of w-bounded and w-free.

- Repeat

    h ← pop(H)

    call the Houria solver in order to extract from h
    the maximum subset of the constraints in h that
    contain free variables and can be solved

    H' ← push(h)

  Until (H is empty) or (the sum of the weight of the
  maximum subset and w-bounded is not less than
  the sum of w-bounded and w-free of the hierarchy
  at the head of H).

- Keep in H' only the hierarchies that have the maximal weight.

- The free variables of each hierarchy in H' are computed and returned.
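
The ordered repeat-until loop above behaves like a small branch and bound over
hierarchies, which can be sketched as follows. This is only an illustration:
`select_hierarchies` and `solve_free` are hypothetical names, `solve_free`
stands in for the call to the Houria solver, the elimination step is omitted,
and the stopping test uses the best weight found so far rather than only the
weight of the hierarchy just solved.

```python
def select_hierarchies(hierarchies, solve_free):
    """hierarchies: list of (name, w_bounded, w_free) tuples.
    solve_free(name): hypothetical stand-in for the solver call; returns the
    weight of the largest solvable subset of the constraints with free
    variables (at most w_free).
    Returns the names of the hierarchies that reach the maximal total weight."""
    # Order by the optimistic bound w_bounded + w_free, largest first.
    pending = sorted(hierarchies, key=lambda h: h[1] + h[2], reverse=True)
    solved = []                      # plays the role of H'
    best = float("-inf")
    while pending:
        name, w_bounded, _ = pending.pop(0)
        total = w_bounded + solve_free(name)
        solved.append((name, total))
        best = max(best, total)
        # Stop once the next hierarchy's bound cannot exceed the best weight.
        if pending and best >= pending[0][1] + pending[0][2]:
            break
    return [name for name, total in solved if total == best]


calls = []
achieved = {"a": 1, "b": 1, "c": 2}   # made-up solver results


def fake_solver(name):
    calls.append(name)
    return achieved[name]


print(select_hierarchies([("a", 5, 5), ("b", 6, 1), ("c", 2, 2)], fake_solver))
```

In this made-up run, hierarchy c is never passed to the solver because its
optimistic bound (4) is already below the weight reached by b (7).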

3  Conclusion 

Our algorithm can be included in the HCLP languages. It is promising when the
number of alternate rule choices for each predicate is not large; otherwise,
the Houria system can be used within a branch and bound (B&B) algorithm to
obtain the inter-hierarchy comparisons.
1 Summary

We describe the design and implementation of a finite domain constraint solver
embedded in the SICStus Prolog system, using an extended unification mechanism
via attributed variables as a generic constraint interface. The solver is based
on indexicals, i.e. reactive functional rules performing incremental constraint
solving or entailment checking. Propagation is done using the arc-consistency
algorithm AC-3, adapted for non-binary constraints. At the heart of the
algorithm is an evaluator for indexicals.

The solver provides the usual predefined search strategies (fixed order,
fail-first principle, branch and bound for minimization or maximization). Access
predicates for the relevant variable attributes (domain value, bounds, size,
etc.) are also provided, making customized search strategies easily
programmable.

A  design  goal  has  been  to  keep  the  solver  open-ended  and  extendible  as  well 
as  to  keep  a  substantial  part  written  in  Prolog,  partly  contradicting  conventional 
wisdom  in  implementing  constraint  solvers. 

2  Design  Overview 

The solver has an open-ended design in several senses: (1) The user can define
new elementary constraints in terms of indexicals. (2) Non-local constraint
propagation is available via global constraints defined by the user via a
programming interface. Such constraints may use specialized algorithms for
application-specific constraint solving. (3) An elementary constraint can be
linked to a 0/1 variable denoting its truth value.

Indexicals are used to define both the constraint solving and the entailment
checking aspects of a constraint. Arithmetic expressions and symbolic constraints
are compiled to elementary ones or to indexicals (see e.g. [2]). Constraints can
also be arbitrarily combined using the propositional connectives.

The interface between the SICStus Prolog engine and the solver is provided
in part by the attributed variables mechanism [3], which has been used
previously to add several constraint solvers to the SICStus and ECLiPSe Prologs.
This mechanism associates solver-specific information with variables, and
provides hooks for extended unification and projection of answer constraints.

The generality and flexibility of the indexical approach come at a cost.
Indexicals have a small grain size; the ratio of useful invocations is low for
many problems; they prune only one variable at a time, and incur high scheduling
overhead (since they produce many suspensions for an n-ary constraint).

Global constraints, on the other hand, maintain consistency over an arbitrary
number of variables, are resumed when needed (under certain constraint-specified
conditions), and use specific, incremental algorithms for each constraint, e.g.
graph algorithms and OR techniques.

In our framework, we mix indexicals and global constraints by having separate
scheduling queues for the two; a global constraint is only resumed when no
indexicals are scheduled for execution. Thus, global constraints can be seen as
having lesser priority than indexicals. This is reasonable, since indexicals (per
invocation) are cheap, while specialized algorithms for global consistency are
often expensive.
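
The two-queue discipline can be pictured with a small sketch. This is not the
SICStus implementation: the class and method names are invented for
illustration, and propagators are modelled as plain callables that may schedule
further work.

```python
from collections import deque


class Scheduler:
    """Two queues: a global constraint runs only when no indexical is pending."""

    def __init__(self):
        self.indexicals = deque()
        self.global_constraints = deque()

    def schedule_indexical(self, task):
        self.indexicals.append(task)

    def schedule_global(self, task):
        self.global_constraints.append(task)

    def run(self):
        """Run to quiescence; each task receives the scheduler as argument."""
        while self.indexicals or self.global_constraints:
            if self.indexicals:
                self.indexicals.popleft()(self)
            else:
                self.global_constraints.popleft()(self)


trace = []


def g1(sched):
    trace.append("g1")
    # A global constraint prunes something and wakes up an indexical.
    sched.schedule_indexical(lambda _: trace.append("i2"))


s = Scheduler()
s.schedule_global(g1)
s.schedule_indexical(lambda _: trace.append("i1"))
s.schedule_global(lambda _: trace.append("g2"))
s.run()
print(trace)
```

The indexical woken by g1 runs before the second global constraint, which is the
priority behaviour described above.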

Global constraints are defined via a programming interface, making it possible
to incorporate problem-specific algorithms to enhance propagation power.
For an occurrence of a global constraint c, the interface provides services to
suspend and resume c on a collection of variables, to maintain a private state
containing information used by incremental algorithms, and to prune variables
and propagate the effects. The constraint-specific algorithm is responsible for
determining (dis)entailment, and for computing the new domains of any variables
to be pruned.
A constraint network is defined by $(X, D, C, R)$, where $X$ is a set of $n$
variables $\{x_1, \ldots, x_n\}$ and $D$ is a set of $n$ domains
$\{D_1, \ldots, D_n\}$ such that $D_i$ is the set of the possible values for
variable $x_i$. $C$ is a set of $e$ binary constraints where each
$C_{ij} \in C$ is a constraint between the variables $x_i$ and $x_j$, defined by
its associated relation $R_{ij}$ giving the allowed pairs of values for $x_i$
and $x_j$ (i.e. $R_{ij} \subseteq D_i \times D_j$). Moreover,
$(b, a) \in R_{ji} \Leftrightarrow (a, b) \in R_{ij}$. The fact that
$(a, b) \in R_{ij}$ will be denoted by "$R_{ij}(a, b)$ is true". Recently,
efficient algorithms have been proposed to achieve arc- and path-consistency in
constraint networks (for all definitions and references, see [ChmJeg96b]). The
best path-consistency algorithm proposed is PC-6, which is a natural
generalization of AC-6 [Bessi94] to path-consistency [ChmJeg95]. Its time
complexity is $O(n^3 d^3)$ and its space complexity is $O(n^3 d^2)$, with $d$
the size of the largest domain. Unfortunately, we have observed that PC-6,
though much better than PC-4, is not very efficient in practice, especially for
those classes of problems that require a large amount of space to run.
Therefore, we propose here a new path-consistency algorithm called PC-8, the
principle of which is also used to define a new algorithm to achieve
arc-consistency, called AC-8. For details on algorithms AC-8 and PC-8 see
[ChmJeg96b].

AC-8 and PC-8 are based on the notion of support. For arc-consistency, a
support for a value $a \in D_i$ w.r.t. a constraint $C_{ij}$ is a value
$b \in D_j$ compatible with $a$, i.e. such that $R_{ij}(a, b)$ is true. If a
value $a \in D_i$ does not possess a support w.r.t. a constraint $C_{ij}$, this
value cannot satisfy arc-consistency and must be removed from its domain. For
path-consistency, a support is a value that allows a pair of values to be
compatible: a pair of values $(a, b) \in R_{ij}$ is supported by the value
$c \in D_k$ if the relations $R_{ik}(a, c)$ and $R_{jk}(b, c)$ hold. AC-8 is
based on supports but without recording any of them. When a value $(i, a)$ is
removed from its domain $D_i$, AC-8 records $i$, the reference of the variable
$x_i$, in a list denoted List-AC. Propagations will be realized w.r.t. variables
in this list. Suppose that $i$ is removed from the list; then all neighbouring
variables will be considered, i.e. for all $j \in X$ such that $C_{ji} \in C$,
and for each value $b \in D_j$, AC-8 will ensure that there is a value
$a \in D_i$ such that $R_{ij}(a, b)$ holds. Unlike AC-6, AC-8 has to restart the
search from the first value of the domain. If no support $a$ of $b$ is found in
$D_i$, then $b$ must be deleted, and $j$ must be inserted in List-AC. The
initialization phase consists in checking, for each $j \in X$ such that
$C_{ij} \in C$, whether there exists at least one support per value
$a \in D_i$. So, if for some variable $x_j$, $a$ has no support in $D_j$, it
must be deleted and $i$ must be added to the list. In the propagation phase,
AC-8 restarts looking for a new support from the first value of the domain.
Finally, the scheme of AC-8 is a classical scheme of propagation: the algorithm
stops when the list of propagation becomes empty, each step corresponding to the
propagation of a deletion. The space complexity of AC-8 is bounded by the size
of the list, so it is $O(n)$, while the time complexity of AC-8 is $O(ed^3)$.
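
The scheme just described (no recorded supports, a list of variables whose
domain has shrunk, and a rescan from the first value on every check) can be
sketched as follows. This is a reading of the prose above rather than the
authors' code; the function name, the data layout and the `queued` set used to
avoid duplicate list entries are assumptions of this sketch.

```python
from collections import deque


def ac8(domains, relations):
    """domains:   dict var -> list of values (scanned from the first value).
    relations: dict (i, j) -> predicate R_ij(a, b); both directions present.
    Prunes domains in place; returns False if some domain becomes empty."""
    neighbours = {v: [] for v in domains}
    for (i, j) in relations:
        neighbours[i].append(j)

    def has_support(i, a, j):
        # No support is recorded: always rescan D_j from its first value.
        return any(relations[(i, j)](a, b) for b in domains[j])

    def revise(i, j):
        """Remove values of D_i without support in D_j; report any change."""
        kept = [a for a in domains[i] if has_support(i, a, j)]
        changed = len(kept) != len(domains[i])
        domains[i] = kept
        return changed

    pending = deque()     # List-AC: variables whose domain has shrunk
    queued = set()

    def enqueue(v):
        if v not in queued:
            queued.add(v)
            pending.append(v)

    for i in domains:                 # initialization phase
        for j in neighbours[i]:
            if revise(i, j):
                enqueue(i)
    while pending:                    # propagation phase
        i = pending.popleft()
        queued.discard(i)
        for j in neighbours[i]:
            if revise(j, i):
                enqueue(j)
    return all(domains.values())


doms = {"x": [1, 2, 3], "y": [1, 2, 3], "z": [1, 2, 3]}
rels = {
    ("x", "y"): lambda a, b: a < b, ("y", "x"): lambda a, b: a > b,
    ("y", "z"): lambda a, b: a < b, ("z", "y"): lambda a, b: a > b,
}
print(ac8(doms, rels), doms)
```

Only the pending list and the membership set are kept between steps, which is
where the small space requirement comes from; the price is the repeated scan
inside `has_support`.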

PC-8 is based on the same principles as AC-8 and appears to be an optimization
of PC-7 [ChmJeg96] with better space complexity. When a pair of values $(a, c)$
is removed from a relation $R_{ik}$, two 3-tuples will be recorded in a list
called List-PC: $(i, a, k)$ and $(k, c, i)$. Propagations will be realized
w.r.t. these 3-tuples: if $(i, a, k)$ is propagated, then for all values
$b \in D_j$ such that $R_{ij}(a, b)$ holds, the propagation must verify that
there exists $c \in D_k$ which supports $(a, b) \in R_{ij}$. If no support $c$
of $(a, b)$ is found in $D_k$, then $(a, b)$ must be deleted, and two 3-tuples,
namely $(i, a, j)$ and $(j, b, i)$, have to be inserted in List-PC. The
initialization phase consists in checking whether there exists at least one
support $c \in D_k$, $\forall k \in X$, per pair of values $(a, b) \in R_{ij}$.
So, any pair with no support must be deleted and two 3-tuples, $(i, a, j)$ and
$(j, b, i)$, must be added to the list. Finally, the scheme of PC-8 is slightly
different from the scheme of PC-4 or PC-6, but it is a classical scheme of
propagation since propagations stop when List-PC becomes empty. The space
complexity of PC-8 is $O(n^2 d)$ while the time complexity of PC-8 is
$O(n^3 d^4)$.

In [ChmJeg96b], experiments were performed over randomly generated CSPs.
AC-8 was compared with AC-3, AC-4 and AC-6. We found that, for the number of
consistency checks, AC-6 is the best one. By contrast, concerning CPU time,
AC-3 and AC-8 are similar and both outperform AC-6. Concerning path-consistency
algorithms, PC-8 was compared with PC-2, PC-6 and PC-7. From our experiments,
it is clear that PC-6 performs the smallest number of consistency checks, while
PC-7 and PC-8, which are similar, outperform PC-2. With CPU time as the measure
of performance, PC-7 and PC-8 are clearly the best algorithms. The fact that,
for CPU time, PC-8 (resp. AC-8) outperforms PC-6 (resp. AC-6) can be explained
naturally by considerations linked to the theoretical complexity in time and
space. As with AC-6, the data structures of PC-6 lead to an optimal theoretical
time complexity, but they increase the CPU time because of the number of
operations required for each propagation, which is much larger than that of
PC-8 (resp. AC-8). So, the hidden multiplicative constant of PC-6 (resp. AC-6)
is much greater than that of PC-8 (resp. AC-8).
We present a new approach for determining consistency of temporal constraint
networks based on Allen's interval-based framework [All83]. The temporal
network is first translated into a clausal theory in propositional logic
[Men64], whose satisfiability is then determined using an anytime family of
tractable reasoners [Dal96].

Anytime reasoners are complete reasoners that provide partial answers even
if stopped prematurely; the completeness of the answer improves with the time
used in computing the answer. Our anytime family $\vdash_0, \vdash_1, \ldots$
is built upon clausal boolean constraint propagation (BCP) [McA90], a variant of
unit resolution. It is known that each $\vdash_k$ is sound and tractable, each
$\vdash_{k+1}$ is at least as complete as $\vdash_k$, and each propositional
theory has a complete reasoner $\vdash_k$ for reasoning with it.
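
For readers unfamiliar with BCP, the following sketch shows plain unit
propagation over clauses. It illustrates only the building block mentioned
above, not the anytime family of [Dal96] itself; the function name and the
integer encoding of literals (a negative integer denotes a negated atom) are
assumptions of this example.

```python
def bcp(clauses):
    """Clausal boolean constraint propagation (unit propagation).

    clauses: iterable of iterables of non-zero ints (a literal -n negates n).
    Returns (assigned, conflict): the set of literals forced true, and whether
    some clause had all of its literals falsified."""
    clauses = [set(c) for c in clauses]
    assigned = set()
    changed = True
    while changed:
        changed = False
        for c in clauses:
            if any(lit in assigned for lit in c):
                continue                       # clause already satisfied
            open_lits = [lit for lit in c if -lit not in assigned]
            if not open_lits:
                return assigned, True          # every literal is false
            if len(open_lits) == 1:
                assigned.add(open_lits[0])     # unit clause forces its literal
                changed = True
    return assigned, False


print(bcp([[1, 2], [-1]]))
print(bcp([[1], [-1, 2], [-2, -3], [3]]))
```

The first call forces literal 2 once atom 1 is known to be false; the second
derives a conflict, i.e. the clause set is detected as unsatisfiable.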

A straightforward translation of a network into a propositional theory can
be obtained by directly instantiating each entry of Allen's transitivity table
by relations among each triple of intervals in the network. We improve upon this
by first embedding a tree in the network and then using the edges of this tree
to restrict the number of formulas.

A temporal network is a directed graph whose edges are labeled by subsets
of the 13 basic temporal relations defined by Allen. Consider any maximal tree
$M$ embedded in the given connected network $N$ (our approach extends easily to
non-connected networks). The edges in $M$ are called tree edges, and the rest of
the edges in $N$ are called back edges. The notions of parent, ancestor, etc.
are defined as usual, with respect to the tree $M$. For each tree edge $(x, y)$,
$Rel(x, y)$ is defined to be the label of edge $(x, y)$, and for a sub-branch
$x_1, \ldots, x_k$ of nodes in $M$, where $k > 2$, $Rel(x_1, x_k)$ is defined to
be the set $Rel(x_1, x_2) \circ \ldots \circ Rel(x_{k-1}, x_k)$ of relations.
Note that $\circ$ is the composition relation defined by Allen. The translated
theory $T_t(N)$ with respect to tree $M$ consists of exactly the following
formulas:

I: $x\,r_1\,y \vee \ldots \vee x\,r_k\,y$ for each edge $(x, y)$ with label
$\{r_1, \ldots, r_k\}$ in $N$;

II: $\neg x\,r\,y \vee \neg y\,t\,z \vee x\,r_1\,z \vee \ldots \vee x\,r_k\,z$
for each triple $x, y, z$ of nodes such that $x$ is a proper ancestor of $y$ and
$y$ is a proper ancestor of $z$, for each relation $r$ in $Rel(x, y)$, and for
each relation $t$ in $Rel(y, z)$, where $r \circ t = \{r_1, \ldots, r_k\}$;

III: similar to formulas II above, except that $z = x$, $x$ is an ancestor of
$y$, and there is a back edge from $y$ to $x$; and

IV: $z\,e\,z$ and $\neg z\,u\,z$ for each node $z$ with an incoming back edge
and each relation $u$ different from $e$ (equal) such that the atom $z\,u\,z$
occurs in some formula III above.

Theorem 1 shows that the above translation is sound and complete:

Theorem 1. A network $N$ is consistent iff the theory $T_t(N)$ is satisfiable.

We present an algorithm that obtains a translated theory for any given temporal
network $N$, and prove that the time taken by the algorithm and the size
of the translated theory $T_t(N)$ are both bounded by $O(nm^2)$, where $n$ is
the number of nodes in $N$ and $m$ is the length of the longest branch in the
embedded tree $M$. Since $m$ can grow as large as $n$, the worst-case size of a
translated theory is cubic in the size of the network.

Since we are interested only in determining consistency of temporal networks,
we use the anytime family $\vdash_0, \vdash_1, \ldots$ to determine the
satisfiability of translated theories. The least $k$ needed for obtaining f from
the translated theory using $\vdash_k$ is called the intricacy of the network.
Since each $\vdash_k$ can be determined in time exponential in $k$, the
intricacy of a network captures the difficulty of detecting its inconsistency
using our approach.

We randomly generated thousands of small networks for comparing our approach
with three other incomplete reasoners. We found that intricacy is 1 for
all networks that were found inconsistent by Allen's path consistency algorithm.
However, the other two algorithms, which first translate interval networks into
point algebra networks (with and without $\neq$, respectively), could not detect
inconsistency in most networks with intricacy 1. We have not yet found any
inconsistent network with intricacy greater than 2.

Since the translated theories are quite large, our current approach can be used
only for networks with a small number of intervals. We are currently working on
further reducing the size of the translated theory by translating fewer
transitivity rules. We are also extending our approach to handle qualitative
constraints like "A before B or B before C" that cannot be expressed in temporal
networks. Our current work also involves inferring implicit temporal constraints
from temporal networks, rather than just determining whether a network is
consistent.


References 

[All83] J.F. Allen. Maintaining knowledge about temporal intervals.
Communications of the ACM, 26(11):832-843, 1983.

[Dal96] M. Dalal. Anytime families of tractable propositional reasoners. In
Fourth International Symposium on Artificial Intelligence and Mathematics
(AI/MATH-96), pages 42-45, Florida, 1996.

[McA90] D. McAllester. Truth maintenance. In Proceedings Eighth National
Conference on Artificial Intelligence (AAAI-90), pages 1109-1116, 1990.

[Men64] E. Mendelson. Introduction to Mathematical Logic. Van Nostrand,
Princeton N.J., 1964.


From  Constraint  Minimization  to  Goal 
Optimization  in  CLP  Languages 


François Fages

LIENS CNRS, Ecole Normale Supérieure,
45 rue d'Ulm,
75005 Paris, France
e-mail: fages@dmi.ens.fr http://www.ens.fr

Constraint logic programming and concurrent constraint programming are
simple and powerful models of computation that have been implemented in several
systems over the last decade, and have proved successful in a variety of
applications ranging from combinatorial optimization problems to complex system
modeling. In particular the CLP approach is increasingly used to solve hard
scheduling and planning problems. Of course in these applications the ability of
the system to generate not all solutions but only best solutions is a
fundamental property. The basic optimization procedure currently used in CLP
systems is a variant of the branch and bound procedure, where constraints are
used to prune the search space. That procedure can be used to find optimal
solutions of the top-level query w.r.t. an objective function, but it becomes
unsound when applied to subgoals of the program. The extension of CLP languages
with optimization predicates is an important issue for solving multi-criteria
optimization problems and modeling multi-component systems for which several
optimization goals have to be combined in the query and/or the program. The
problem is to reconcile the evaluation procedures for optimization goals with
the declarative semantics of CLP and its compositionality properties.
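
As a reminder of what the basic top-level procedure looks like, here is a
minimal branch and bound sketch over finite domains. It is a generic
illustration, not the procedure of any particular CLP system: the function name
and data layout are assumptions, constraints are modelled as predicates over
partial assignments, and the cost function is assumed to return a lower bound on
partial assignments that is exact on complete ones.

```python
def branch_and_bound(domains, constraints, cost):
    """Minimise cost over complete assignments that satisfy the constraints.

    domains:     dict var -> iterable of values.
    constraints: predicates over a partial assignment dict; each must return
                 True while some of its variables are still unbound.
    cost:        lower bound on partial assignments, exact on complete ones.
    """
    variables = list(domains)
    best = {"cost": float("inf"), "solution": None}

    def search(k, assignment):
        if cost(assignment) >= best["cost"]:
            return                       # bound: cannot improve the incumbent
        if k == len(variables):
            best["cost"] = cost(assignment)
            best["solution"] = dict(assignment)
            return
        var = variables[k]
        for value in domains[var]:
            assignment[var] = value
            if all(c(assignment) for c in constraints):
                search(k + 1, assignment)
            del assignment[var]

    search(0, {})
    return best["solution"], best["cost"]


def at_least_three(a):
    # Holds trivially until both x and y are bound.
    return "x" not in a or "y" not in a or a["x"] + a["y"] >= 3


result = branch_and_bound(
    {"x": range(4), "y": range(4)},
    [at_least_three],
    lambda a: 2 * a.get("x", 0) + a.get("y", 0),
)
print(result)
```

Each time a better solution is found, the bound tightens and prunes the rest of
the search; this global bound on the whole query is what no longer carries over
soundly when the same procedure is applied to a subgoal.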

In this poster we review several forms of optimization within CLP languages,
and study different execution models which are complete w.r.t. Kunen’s
three-valued logical semantics of the program's completion. First we define the
constraint minimization problem that the constraint solver is assumed to solve, and
the basic branch and bound procedure used for query optimization w.r.t. an
objective function. Then we show how optimization subproblems can be encapsulated
in CLP programs with a higher-order optimization predicate, which is
interpreted declaratively under Kunen’s three-valued logical semantics of general
CLP programs, and for which we give a complete top-down evaluation procedure.
In this approach optimization predicates can be combined arbitrarily in the
program, and recursion through optimization predicates is supported without any
restriction. The top-down procedure is based on a concurrent pruning mechanism
between standard derivation trees: the successful derivations of the minimization
goal are the successful derivations in the main tree tpi whenever the auxiliary
tree gets finitely failed after pruning (see figure below).

More general local optimization predicates with a set of protected variables
have the power of general CLP programs with negation. We derive a more complex
complete top-down procedure from the scheme of constructive negation by
pruning [1] in this context, and propose an alternative bottom-up evaluation
procedure based on a finitary version of Fitting’s operator [1]. Our claim is not
that the complete procedures described for local optimization predicates can be
used directly to solve complex optimization problems efficiently, but that they
can serve as a basis for designing more efficient procedures in particular cases,
e.g. under termination or groundness assumptions, and for analyzing their
completeness w.r.t. the declarative semantics. For instance, if the local optimization
goals are delayed until the protected variables are instantiated, then one can
clearly rely on the previous simpler procedure.

Constructive negation by pruning can also be used to interpret directly
preferences among solutions expressed by CLP programs (instead of objective
functions) [2]. In this more general setting, an incremental execution model based on
dynamic constraint solvers and on the set of operators for transforming
derivations described in [3] is discussed w.r.t. interactive and multi-criteria
optimization problems.


In 1980, Haralick and Elliott [1] introduced the forward checking (FC)
algorithm, as well as two variants which they called partial looking ahead (PLA) and
full looking ahead (FLA). Forward checking is a modification to backtracking
search: after each variable is instantiated, values from the domains of future
variables are filtered out if they are inconsistent with the present partial assignment.
The extra work required to do this filtering almost always pays off in the reduced
size of the search space. Full looking ahead is a further modification which does
more extensive processing of future variables by comparing each future variable
with each other, in effect performing a single iteration of the arc-consistency
“revise” procedure described in [2]. Partial looking ahead is an abbreviated form
of full looking ahead, where future variables are only compared with those after
them in the ordering.
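
The filtering step of forward checking can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: constraints are binary and given by a hypothetical predicate `allowed(var_a, val_a, var_b, val_b)`, and the smallest-domain-first variable ordering in `solve` is an assumed stand-in for a dynamic variable ordering heuristic.

```python
from typing import Callable, Dict, List, Optional

Domains = Dict[str, List[int]]
# allowed(var_a, val_a, var_b, val_b) is True when the pair of values is permitted.
Allowed = Callable[[str, int, str, int], bool]


def forward_check(var: str, value: int, future: Domains,
                  allowed: Allowed) -> Optional[Domains]:
    """Filter future domains against var=value; return None on a domain wipe-out."""
    filtered: Domains = {}
    for other, domain in future.items():
        kept = [w for w in domain if allowed(var, value, other, w)]
        if not kept:
            return None  # dead end found before `other` is ever instantiated
        filtered[other] = kept
    return filtered


def solve(domains: Domains, allowed: Allowed,
          assignment: Optional[Dict[str, int]] = None) -> Optional[Dict[str, int]]:
    """Backtracking search with forward checking (illustrative sketch)."""
    assignment = dict(assignment or {})
    if not domains:
        return assignment
    # Assumed heuristic: instantiate a variable with the smallest remaining domain.
    var = min(domains, key=lambda v: len(domains[v]))
    future = {v: d for v, d in domains.items() if v != var}
    for value in domains[var]:
        reduced = forward_check(var, value, future, allowed)
        if reduced is not None:
            result = solve(reduced, allowed, {**assignment, var: value})
            if result is not None:
                return result
    return None
```

Because every value left in a future domain is already consistent with all past assignments, `solve` never needs to check a new value against earlier variables; it only filters forward.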

Over  the  last  15  years,  forward  checking  has  become  one  of  the  primary 
algorithms  in  the  CSP-solver’s  arsenal,  while  partial  and  full  looking  ahead  have 
received  little  attention.  This  neglect  is  due,  no  doubt,  in  large  part  to  the 
conclusions  reached  in  [1]:  “The  checks  of  future  with  future  units  do  not  discover 
inconsistencies  often  enough  to  justify  the  large  number  of  tests  required.” 

Our  experiments  demonstrate  that  when  combined  with  the  dynamic  variable 
ordering  heuristic  described  in  [1],  the  full  looking  ahead  algorithm  is  more  useful 
than  often  supposed,  and  in  fact  substantially  outperforms  forward  checking  on 
problems  with  relatively  tight  constraints  and  relatively  sparse  constraint  graphs. 
Because  we  find  experimentally  that  each  algorithm  is  superior  to  the  other  on 
certain  types  of  problems,  we  are  interested  in  the  possibility  of  automatically 
invoking  the  superior  heuristic  on  any  individual  problem.  Another  approach  is 
to  vary  the  amount  of  look  ahead  within  an  individual  problem,  adopting  either 
the  forward  checking  level  or  the  full  looking  ahead  level  on  different  sub-trees, 
according  to  some  heuristic.  We  present  below  three  new  variants  of  looking 
ahead  which  take  this  approach,  using  three  different  heuristics. 

Experiments reported in the full paper (available through
http://www.ics.uci.edu/~dechter) indicate that the extra work performed by full looking ahead
is most effective higher up in the search tree. We therefore developed a version
of full looking ahead which we call truncated looking ahead (TLA). The
modification is simple: the extra processing associated with full looking ahead is done
only when the newly instantiated variable is at depth 10 or less in the search
tree. At greater depths, truncated looking ahead is identical to forward checking.

The self-adjusting looking ahead algorithm (SALA) starts the full looking
ahead process after each instantiation at every level in the search tree, but stops
(for that one instantiation) if sufficient progress is not being made. Progress is
defined as removing a sufficient fraction of the values in the domains of future
variables, and is controlled by a parameter.

One advantage of performing full looking ahead is that many values in the
domains of future variables are removed. This leads to another advantage: a
dynamic variable ordering heuristic based on domain size is more likely to find a future
variable which has a very small domain. In the absence of a dead-end, we hope to
find a future variable with a domain size of 1, since instantiating this single value
represents a forced choice that will have to be made eventually. The smart looking
ahead (SLA) algorithm performs the full looking ahead level of consistency
enforcing at each instantiation, but stops when the remaining domain size of some future
variable becomes 0 or 1. The goal is to do enough looking ahead to guide the
variable ordering heuristic effectively.

[Chart: relative performance of the algorithms; x-axis: increasing graph density.]
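
The stopping rule of smart looking ahead can be illustrated with a short Python sketch. It is a hedged reading of the description above, not the authors' code: the function name `smart_look_ahead`, the binary-constraint predicate `allowed`, and the order in which pairs of future variables are compared are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

Domains = Dict[str, List[int]]
# allowed(var_a, val_a, var_b, val_b) is True when the pair of values is permitted.
Allowed = Callable[[str, int, str, int], bool]


def smart_look_ahead(future: Domains, allowed: Allowed) -> Optional[str]:
    """Filter future domains against each other, stopping early as SLA is described.

    Returns the name of the first future variable whose domain shrinks to
    size 0 (dead end) or 1 (forced choice), or None if the pass completes.
    The domains in `future` are filtered in place.
    """
    names = list(future)
    for x in names:
        for y in names:
            if x == y:
                continue
            # Keep only the values of x that have some support in y's domain.
            future[x] = [a for a in future[x]
                         if any(allowed(x, a, y, b) for b in future[y])]
            if len(future[x]) <= 1:
                return x
    return None
```

A caller would then hand the returned variable to the variable ordering heuristic: a domain of size 1 is instantiated next, and an empty domain triggers backtracking.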

The chart shows the relative performance of five algorithms (SALA has been
omitted for legibility), each run with the dynamic variable ordering scheme
proposed in [1]. The x-axis indexes 10 sets of parameters (number of variables,
number of values per variable, number of binary constraints, tightness of
constraints), all at the 50% solvable crossover point. For each set of parameters
we generated 500 random problems. The y-axis displays the ratio of each
algorithm’s mean CPU time to that of forward checking, using a logarithmic scale.
Our conclusion is that smart looking ahead is the most promising: its
performance is between forward checking’s and full looking ahead’s in six out of ten
cases, and is better than both in the other four.


The phenomenon of phase transitions occurring in many classes of NP-complete
problem as a control parameter is varied has been recognised and studied
extensively in recent years. Cheeseman, Kanefsky and Taylor [1] first reported the
phase  transition  as  the  interface  between  a  region  where  almost  all  problems 
have  many  solutions  and  are  relatively  easy  to  solve,  and  a  region  where  almost 
all  problems  have  no  solution  and  their  insolubility  is  relatively  easy  to  prove. 
In  this  intervening  region,  the  probability  of  problem  solubility  falls  from  close 
to  1  to  close  to  0,  and  the  median  cost  of  searching  these  problems  is  highest, 
reaching  a  peak  at  the  crossover  point  [2]  where  50%  of  problems  are  soluble.  In 
binary  constraint  satisfaction  problems  (CSPs)  with  a  fixed  size  and  constraint 
density,  the  phase  transition  occurs  as  the  tightness  of  the  constraints  increases, 
from  where  problems  are  under-constrained  to  where  they  are  over-constrained. 

In looking at establishing levels of consistency in CSPs, reported fully in [3],
we observe what appears to be phase transition behaviour exactly analogous to
that found for full search. Arc consistency (AC) and path consistency (PC) were
established in samples of randomly generated CSPs of fixed size and varying
constraint density and tightness, using the AC3 and PC2 algorithms [4]
respectively. For problems of fixed size and constraint density, a peak in the average
cost of establishing consistency occurs between a region of constraint tightness
where the particular level of consistency can be established in all problems, and
achieving this is easy, and a region where the particular level of consistency
cannot be established in any problem (showing each to be insoluble), and achieving
this is easy. In the intervening region, a proportion of problems can be made
consistent, and it is observed that the peak in cost approximately coincides with
the point where this is true for 50% of problems.

The average costs of AC3 and PC2 were investigated in terms of both
consistency checking effort and the number of arc- or path-inconsistent values, or
“nogoods”, pruned from variable domains. Both algorithms were set to
terminate upon the first wipe-out of an entire variable domain, when it is clear that
a problem is insoluble. In terms of domain pruning, as we move into the AC or
PC phase transitions, AC3/PC2 starts to find a number of nogoods in average
problems, although not enough to cause the complete wipe-out of any variable
domain. At the peak in cost, domain wipe-out occurs for about 50% of
problems, allowing the algorithm to terminate early. Moving into the inconsistent
region, the number of nogood values in problems is still increasing, but causes
domain wipe-out to occur more quickly. This results in earlier termination of
the algorithm, and thus a fall in the average domain pruning cost. The pattern
of behaviour for consistency checking through the AC and PC phase transitions
matches that of domain pruning, with the peak in effort coinciding with the peak
in nogoods pruned by the algorithm.
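
The two cost measures, and the early termination on domain wipe-out, can be made concrete with a small AC-3 sketch in Python. This is a textbook-style illustration written for this purpose rather than the code used in the study; the function name `ac3`, the `neighbours` map and the predicate `allowed` are assumptions of the example.

```python
from collections import deque
from typing import Callable, Dict, Set, Tuple

# allowed(var_a, val_a, var_b, val_b) is True when the pair of values is permitted.
Allowed = Callable[[str, int, str, int], bool]


def ac3(domains: Dict[str, Set[int]], neighbours: Dict[str, Set[str]],
        allowed: Allowed) -> Tuple[bool, int, int]:
    """Enforce arc consistency in place.

    Returns (consistent, checks, pruned): whether no domain was wiped out,
    the number of consistency checks performed, and the number of values
    ("nogoods") removed from domains.
    """
    checks = 0
    pruned = 0
    queue = deque((x, y) for x in domains for y in neighbours[x])
    while queue:
        x, y = queue.popleft()
        revised = False
        for a in sorted(domains[x]):  # sorted() copies, so removal below is safe
            supported = False
            for b in domains[y]:
                checks += 1
                if allowed(x, a, y, b):
                    supported = True
                    break
            if not supported:
                domains[x].discard(a)
                pruned += 1
                revised = True
        if revised:
            if not domains[x]:
                return False, checks, pruned  # first wipe-out: stop immediately
            queue.extend((z, x) for z in neighbours[x] if z != y)
    return True, checks, pruned
```

In the inconsistent region the early `return False` fires after fewer revisions, which is the mechanism behind the fall in both measured costs described above.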

In reporting the phase transition behaviour associated with establishing AC
and PC in CSPs, we make an analogy with the phase transition behaviour
observed when finding a single solution to the same problems. At first glance,
it appears that the analogy should in fact be made with respect to finding all
solutions to problems. However, if we consider establishing a certain level of
consistency in a problem as performing the minimal amount of work necessary to
prove that the problem can possess such consistency, then the validity of the
analogy becomes clear: when attempting to establish k-consistency in an n-variable
problem, where k < n, all paths of length k must be made consistent and the
effects of the removal of inconsistent assignments must be propagated around
the other paths; but when attempting to establish n-consistency, only one path
of length n exists (the variable ordering is immaterial), and the discovery of one
consistent set of labels for the variables in the form of a solution is sufficient to
prove that the problem can be made n-consistent. The existence of phase
transition behaviour in establishing AC and PC suggests that similar behaviour will
be found when establishing the existence of higher levels of consistency in CSPs,
up to that of establishing full consistency by finding a solution.

A  practical  application  of  the  AC  and  PC  phase  transitions  is  that  they 
clearly  indicate  the  regions  where  establishing  consistency  has  a  domain  pruning 
effect.  While  there  is  very  little  effect  on  average  in  the  under-constrained  region, 
establishing  consistency  in  the  over-constrained  region  proves  the  insolubility  of 
many  problems  without  the  need  for  further  search.  Thus,  if  the  location  of 
consistency  phase  transitions  can  be  accurately  predicted  for  CSPs,  as  that  for 
full  consistency  can  [5],  then  we  can  easily  determine  the  suitability  and  likely 
cost of establishing consistency as a preprocessing step to full search.


1 Introduction

At SimCon we developed the SimCon Constraint Object Oriented Programming
library  (SCOOP)  to  be  used  for  a  large  planning  application  and  to  be  integrated  with 
AutoMod,  a  discrete  event  simulation  tool.  We  outline  SCOOP’s  main  features  and 
we  describe  how  we  combined  constraint  programming  with  discrete  event  simulation. 
We  conclude  with  some  interesting  experiences. 

2  SCOOP 

SCOOP  includes  several  classes  of  constrained  variables,  built-in  constraints,  control 
structures  and  search  strategies,  and  it  provides  specific  support  for  debugging.  The 
library  is  easily  extendible  and  adaptable  to  fit  application  specific  needs. 

All constrained variables in SCOOP are modeled as C++ objects and we support finite
domains containing integers, strings, booleans or pointers (to objects). All constrained
variable objects contain an info-object. This makes it possible to add semantic information
to the constrained variable, which may be consulted during search or propagation. This has
proven to be very useful for writing problem specific constraints or search strategies.
Constraints between constrained variables are objects as well in our library. How a
constraint should react to changes of the domain of one of its constrained variables
is defined by demons. A demon is triggered in reaction to a propagation event (e.g.
minimum of the domain has increased). Users can write their own constraints by
combining built-in constraints or by defining a new constraint class (deriving from the
C Constr virtual base class) for which propagation demons have to be defined. Each
constraint can be put to sleep. This means that the constraint is no longer used in
the program and is completely ignored until you explicitly wake it up. The ability to
deactivate constraints is a powerful way to answer what-if questions.

By default all constraints are treated as equally important. This means that all actions
resulting from propagation events are handled on a first come, first served basis.
However, you can attach priorities to constraints to change this behavior. In that case
actions with the highest priority are treated first.

SCOOP incorporates a number of general tree search strategies which are easily
extendible. The backtracking mechanism is open, allowing users to introduce their
own data structures that should be restored on backtracking. Currently we support
chronological backtracking and backtracking to a ‘labeled’ choicepoint.

In order to increase the usefulness of SCOOP we have introduced several user exits in
the library. These are places where users may introduce their own functions to
enhance or modify the built-in behavior of the library.

Another feature of SCOOP is the ability to reset the domain of a constrained variable.
This is useful for an optimization method that modifies a solution by perturbing it.

SCOOP includes classes implementing time intervals and constraints between them
(e.g. precedes, meets with, ...). To efficiently tackle disjunctive constraints on
intervals we apply an edge finding technique.

3  Combining  Constraint  Programming  with  Simulation 

Discrete  event  simulation  is  mostly  used  to  help  determine  operational  control  rules 
and  to  accurately  answer  system  design  questions.  Simulation  based  scheduling  is 
especially  useful  in  large  discrete  manufacturing  environments  where  other 
technologies  are  not  applicable.  Simulation  is  fast  but  it  does  not  allow  backtracking. 
This  makes  addressing  all  constraints  difficult  in  a  complex  environment. 

We tried several mechanisms to combine constraint programming with simulation. In
a sequential approach we first apply simulation, followed by constraint programming.
In effect, we reduce the complexity of the problem by using the results of
the simulation to turn preferences into hard constraints, derive additional constraints,
derive values for some constrained variables, or parametrise the search strategy.

We have already applied discrete event simulation after finding a solution with
constraint programming, but the tasks of simulation were limited to gathering some
statistical data about the solution and offering the user a 3D graphical animation.
Another approach is to use the two technologies at different levels of
abstraction. You can, e.g., solve the problem at a higher level of abstraction using
constraint programming and detail the solution at a lower level using simulation.
Another general integration approach we have tried is to decompose a problem and
apply the most appropriate technology to each subproblem.

4  Experiences 

In this section we present some of our experiences with solving problems with
constraint programming. A typical difficulty is handling soft constraints that represent
preferences. One way of tackling such a problem is to model it as a Minimal
Violation Problem. If you attach costs to violations of preferences, you can
reformulate your initial problem as the minimization of this cost function.
Unfortunately, complex cost constraints usually lead to insufficient propagation for
use in a branch & bound optimization. Another way of dealing with soft constraints is
to write specific variable and value ordering heuristics that take these preferences
into account. This method is only useful if the number of soft constraints is limited
and if the hard constraints of the problem propagate well enough to quickly reduce the
search space. We have also dealt with soft constraints in a problem by partially
converting them into hard constraints. We verified whether this was justified by
performing some preprocessing simulations.

A major difficulty we encountered with constraint programming is explaining why no
solution for a problem can be found. The user wants some feedback to identify how
the input data should be adapted or which constraints should be relaxed. Depending on the
problem, you may add more preprocessing of data to eliminate the more obvious cases
that would lead to inconsistencies or to indicate potential problem constraints.
Another approach is to let the user interactively set or relax some constraints. This
way you give the user some control over the construction of a solution. Usually the
order in which the user sets constraints corresponds to their relative importance. If a
definitive failure occurs, it relates to the constraint that was introduced last.


1 Introduction

Distributed CSPs have recently drawn the attention of researchers in
Multi-Agent Systems. These problems can be considered to be CSPs where variables
and constraints are distributed among multiple agents.

We have presented a hill-climbing based distributed constraint satisfaction
algorithm [1], in which agents perform hill-climbing while mutually excluding each
other, and form coalitions to make violated constraints consistent when they
meet dead-ends. This algorithm, called Hill-Climbing with Local Consistency
(HCLC), is characterized by the local consistency procedure, which is invoked by
an agent at a dead-end, and the negotiation procedure by which agents mutually
exclude their hill-climbing. In the previous paper, we discussed the local
consistency procedure in detail, but did not fully discuss the property and effect
of the negotiation procedure.

We discuss two things about the negotiation procedure in this paper. One is
the property that the action of neighboring agents is actually suppressed through
the procedure. The other is experimental results which show the effect of
negotiation. In addition, we propose a simple strategy, called weight adjusting,
for improving the performance of the algorithm.

2  Framework 

A Distributed CSP is defined as a problem where each agent has variables,
domains, and constraints. The goal of each agent is to find one set of assignments
to its variables that satisfies all its constraints. Some of the constraints, however,
are defined over sets of variables that include other agents’ variables. Accordingly,
agents must communicate with each other to achieve their goals.

We have developed a hill-climbing method for solving Distributed CSPs. In
this method, all agents try to change their assignments in order to reduce their costs,
where an agent's cost is the number of its own violated constraints. However, a change of
assignments by one agent affects the state spaces of its neighbors. This means
that if we permitted all agents to change their assignments as they like, there
might be a situation where an agent repeatedly fails to reduce its cost and
so does not reach a solution. To eliminate such a situation, we have
developed a coordination mechanism that enables an agent to explicitly suppress
its neighbors’ hill-climbing actions. We refer to this procedure as negotiation.
For lack of space we omit the description of the negotiation procedure, and just
give the following theorem without proof.

Theorem 1. Given finite delays in delivering messages and FIFO (First-In-First-Out)
message passing, the negotiation procedure realizes mutual exclusion of
hill-climbing among neighboring agents.

3  Experimental  Results 

To see the effect of the negotiation procedure, we ran experiments to compare
HCLC with Asynchronous Weak-Commitment search (AWC) [2], which does not
involve a mechanism for explicit coordination. Two classes of instances of distributed
3-coloring problems, called sparsely-connected graphs and densely-connected graphs,
were generated and solved with both algorithms on a simulator. The
number of variables ranged over 30, 50, and 70 for each class of instances.

Results of our experiments are summarized as follows: (1) AWC works better
in terms of the estimated time complexity in all cases; (2) in terms of the load of
constraint checks in agents, HCLC is better on densely-connected graphs, and it is also
better on small-sized sparsely-connected graphs; (3) HCLC uses fewer messages in all cases;
and (4) a smaller amount of repairing assignments is required for HCLC.

Our next concern is to reduce the number of constraint checks done
by HCLC for large-sized sparsely-connected graphs. To do this, we define a
weight, a positive integer, for each constraint in one agent such that its value
corresponds to the number of variables in another agent who shares the
constraint, and measure a cost as the sum of weights of violated constraints. We
call this strategy weight adjusting. Results of our experiments are encouraging:
we obtained at least a fourfold improvement.

4  Conclusions 

We mainly discussed the property and effect of the negotiation procedure used in
HCLC, and proposed a new strategy for improving its performance on a certain
class of distributed 3-coloring problems. Our future work is to test the algorithm
on other classes of the problem and other kinds of problems.


Abstract. Many practical problems are overconstrained, i.e., it is
impossible to find a solution that would satisfy all constraints. In such
situations one may try to find solutions that satisfy as many constraints
as possible, see [1]. In a more general approach one can assign some
weights (utilities) to constraints and then look for solutions that
maximize the sum of weights of satisfied constraints. The problem of finding
such solutions, the Maximum Utility Problem, MUP, is the subject of this
paper. We present a number of approximate algorithms for solving this
problem. Numerous tests have demonstrated high performance of these
algorithms: approximated solutions are, on average, about 2% worse than
optimal ones. Our algorithms are local search procedures that iteratively
improve the current solution by modifying it. As a heuristic that guides
the whole process we use the expected utility value of a solution that
can be obtained from the current (partial) solution by extending it to a
complete one at random. This heuristic has been motivated by an old
algorithm for solving the k-MAXGSAT problem, which was proposed by
Johnson [2].


1  Main  Result 

We will start with some definitions. A Maximum Utility Problem, MUP, is a
tuple $P = \langle V, D, C, U \rangle$ given by: a set of variables $V = \{x_1, x_2, \ldots, x_n\}$,
a collection of finite domains for these variables $D = \{D_1, \ldots, D_n\}$, a set
of constraints $C = \{C_1, \ldots, C_m\}$, and a set of corresponding utilities
$U = \{u_1, \ldots, u_m\}$, where each utility is a real number.

A solution of $P$ is an arbitrary (complete) valuation $v$ of involved variables
that maximizes the total utility of constraints: $u(v) = \sum_{i=1}^{m} u_i \, \chi(C_i, v)$, where
$\chi(C_i, v) = 1$ if $C_i$ is satisfied by $v$; $0$ otherwise.

Let $v$ be a partial valuation and let $p(v, C_i)$ denote the probability of
satisfying $C_i$ by a randomly selected complete extension of $v$ (all extensions have
the same chance). Then the expected utility of $v$, $e(v)$, is defined as
$e(v) = \sum_{i=1}^{m} u_i \, p(v, C_i)$.

The expected utility has the following, very useful, property:

Theorem 1. Let $v$ be a partial valuation and $x$ an arbitrary variable which is not
instantiated by $v$. Then there exists a value $d$ for $x$ such that $e(v[x/d]) \geq e(v)$.
Consequently, $v$ can be extended to a complete valuation $w$ such that $u(w) \geq e(v)$.


2 Experiments

We developed a number of algorithms based on the results of section 1. SAT1
chooses the best variable/value pair to instantiate in each step (a variable/value
pair which maximizes the expected utility). SAT2 and SAT3 do the same but
they instantiate 2 resp. 3 variable/value pairs in each step. SAT1+, ESAT1
and LSAT1 are all based on SAT1. SAT1+ tries to improve a solution found
by SAT1 by revising old assignments. ESAT1 works similarly to SAT1, but
instead of focusing only on uninstantiated variables, all variables are taken into
account. Finally, LSAT1 extends SAT1 by constantly revising the generated
fragment of a solution. See [3] for a more detailed description.
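
The greedy step that SAT1 is described as taking can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' implementation: the constraint representation (scope, predicate, utility), the function names, and the brute-force enumeration used to compute the satisfaction probabilities are all assumptions, and the enumeration is only practical for constraints with small scopes. The guarantee it relies on is the averaging argument behind the expected utility: the expected utility of a partial valuation is the mean of the expected utilities obtained by fixing one more variable to each of its values, so the best value cannot be worse than the mean.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Valuation = Dict[str, int]
# A constraint is (scope, predicate, utility); the predicate takes values in scope order.
Constraint = Tuple[Sequence[str], Callable[..., bool], float]


def satisfaction_probability(v: Valuation, scope: Sequence[str],
                             pred: Callable[..., bool],
                             domains: Dict[str, List[int]]) -> float:
    """Probability that a uniformly random complete extension of v satisfies pred."""
    free = [x for x in scope if x not in v]
    total = 0
    satisfied = 0
    for values in product(*(domains[x] for x in free)):
        full = {**v, **dict(zip(free, values))}
        total += 1
        satisfied += bool(pred(*(full[x] for x in scope)))
    return satisfied / total


def expected_utility(v: Valuation, constraints: List[Constraint],
                     domains: Dict[str, List[int]]) -> float:
    """Sum of utilities weighted by the satisfaction probability of each constraint."""
    return sum(u * satisfaction_probability(v, scope, pred, domains)
               for scope, pred, u in constraints)


def sat1(constraints: List[Constraint], domains: Dict[str, List[int]]) -> Valuation:
    """Greedily instantiate the variable/value pair with the best expected utility."""
    v: Valuation = {}
    while len(v) < len(domains):
        candidates = [(x, d) for x in domains if x not in v for d in domains[x]]
        x, d = max(candidates,
                   key=lambda c: expected_utility({**v, c[0]: c[1]}, constraints, domains))
        v[x] = d
    return v
```

For a complete valuation every probability is 0 or 1, so `expected_utility` then returns the total utility of the valuation itself.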

The algorithms were tested on 4725 problems which were optimally solved
by branch and bound. The problem parameters were chosen as follows: number
of variables (10, 15, 20), maximum utility of a constraint (1, 10, 20), density of
the constraint graph (0.2, 0.3, ..., 0.8) and the tightness (0.2, 0.5, 0.8). The
domain size remained constant (4). For each combination of the above parameters
25 problems were generated and solved. The results of the tests can be found in
Table 1.


Table 1. Compressed results of the approximate algorithms

Algorithm  mean error (%)  std. deviation  no. constraint checks  no. optimal
SAT1       2.72            3.41            2.38e+08               1539
SAT2       2.00            2.67            2.87e+09               1830
SAT3       1.65            2.33            3.61e+10               2004
SAT1+      2.10            3.03            2.61e+08               1905
ESAT1      2.08            3.02            5.06e+08               1914
LSAT1      1.86            2.86            5.15e+08               2051

3  Conclusions  and  Further  Research 

All  algorithms  perform  surprisingly  well:  average  approximation  error  varies  from 
2.72%  for  SATl  to  1.86%  for  LSATl,  the  best  algorithm  with  complexity  com¬ 
parable  to  that  of  SATl.  Further  research  will  include  testing  these  algorithms 
on  large  problems  and  comparing  them  with  algorithms  like  simulated  annealing 
and  genetic  algorithms. 

Abstract. This work presents a general meta CLP architecture which
can be built by adding CLP solvers as meta levels reasoning on the
constraints of the underlying object system. We propose two specializations
on finite domains: the first concerns the possibility of embedding the
capability of performing qualitative reasoning in a CLP framework, while
the second concerns a multi-level architecture for obtaining different
degrees of consistency by using weaker algorithms.


Constraint Logic Programming (CLP) [1, 2] is a class of programming
languages combining the LP declarative semantics with the efficiency of constraint
solving. These features have led CLP to receive increasing attention in recent
years from both a theoretical and a practical point of view.

In this work, we focus our attention on CLP on finite domains, CLP(FD), an
expressive and flexible language for solving so-called Constraint Satisfaction
Problems (CSPs). However, some applications in the CSP field need greater
flexibility than that provided by usual CLP(FD) solvers. For example, in the field
of temporal reasoning, some applications require a more powerful propagation
algorithm than arc-consistency, such as path or 4-consistency. CLP(FD) constraint
solvers have an embedded propagation algorithm (namely arc-consistency) which
cannot be changed by the user in accordance with the application to be solved.

Furthermore, temporal reasoning problems usually also require reasoning on
so-called qualitative constraints [4]. One of the problems faced in a temporal
framework is to find the minimal network, i.e., a constraint graph where all
constraints are the tightest. For example, if X < Y, Y < Z and X ≤ Z, then
we want to infer the tightest constraint X < Z. Moreover, in usual constraint
solvers variable domains are reduced in accordance with constraints, but it never
happens that constraints are reduced according to variable values. For example,
if X ranges on [1..5] and Y on [7..16], and the constraint linking the two
variables is X ≤ Y, the relation = can no longer hold given the variable domains. We
would like to propagate the constraint, which becomes X < Y. This propagation is
not provided by usual constraint solvers because they do not reason on passive
constraints (e.g., binary constraints), in the sense that these constraints are used
for the propagation but are never changed during the computation.
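
The tightening of X ≤ Y into X < Y can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the architecture proposed here: a relation is modelled as a set of basic symbols drawn from '<', '=' and '>', and the helper names feasible_relations and tighten are hypothetical.

```python
def feasible_relations(dom_x, dom_y):
    """Return the basic relations ('<', '=', '>') realized by at least one
    pair of values taken from the two domains."""
    rels = set()
    for x in dom_x:
        for y in dom_y:
            if x < y:
                rels.add("<")
            elif x == y:
                rels.add("=")
            else:
                rels.add(">")
    return rels


def tighten(allowed, dom_x, dom_y):
    """Keep only the symbols of `allowed` that the current domains still support."""
    return set(allowed) & feasible_relations(dom_x, dom_y)


# X in [1..5], Y in [7..16], constraint X <= Y modelled as {'<', '='}.
X = range(1, 6)
Y = range(7, 17)
print(tighten({"<", "="}, X, Y))  # {'<'}
```

Here the domains play the role of the active constraints, and the set of symbols is the passive constraint that a usual solver would leave untouched.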

In this paper, we propose a solution to both the above-mentioned problems
via a meta CLP architecture which can be specialized on particular domains
(a longer version of this paper can be found in [3]). In the meta architecture,
each level contains a representation of the object-level constraint store in its
domain. For example, a CLP(FD) solver works on constraints like X :: [1..10],
Y :: [5..10], X < Y. Unary constraints (active constraints) are modified by the
solver (by reducing domain values through constraint propagation) according to
the binary constraint. The binary relation (passive constraint), instead, is never
changed during the computation. If we want to reason on this relation, we can
represent it explicitly or implicitly in the meta constraint store.

We  propose  two  specializations  of  a  meta  CLP(FD)  architecture:  the  first 
concerns  the  possibility  of  embedding  in  a  CLP  framework  the  capability  of 
performing  qualitative  reasoning.  We  implicitly  represent  the  relation  X  <  Y 
via  a  meta  variable  whose  domain  contains  the  constraint  symbols  linking  the 
object  level  variables  X  and  Y.  We  can  define  meta-operations  on  constraint 
symbols  (like  union,  composition  and  intersection)  and  meta-constraints  (like 
the  tighter  relation).  In  this  way,  we  can  find  the  minimal  network  and  reason 
on  constraints  of  the  underlying  system  thus  performing  qualitative  reasoning. 
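
As a rough illustration of meta-operations on constraint symbols, the following Python sketch composes and intersects sets of basic relations over '<', '=' and '>' until nothing changes. It is a sketch under assumptions rather than the meta-level solver described here: the composition table is the standard one for totally ordered points, the function names are hypothetical, and this kind of tightening is not guaranteed to reach the minimal network for every class of relations.

```python
from itertools import product

BASIC = {"<", "=", ">"}
CONVERSE = {"<": ">", "=": "=", ">": "<"}
# Composition of basic relations: a R1 b and b R2 c imply a (R1 ; R2) c.
COMPOSE = {
    ("<", "<"): {"<"}, ("<", "="): {"<"}, ("<", ">"): BASIC,
    ("=", "<"): {"<"}, ("=", "="): {"="}, ("=", ">"): {">"},
    (">", "<"): BASIC, (">", "="): {">"}, (">", ">"): {">"},
}


def compose(r1, r2):
    """Compose two sets of basic relations."""
    out = set()
    for a, b in product(r1, r2):
        out |= COMPOSE[(a, b)]
    return out


def tighten_network(variables, rel):
    """rel maps ordered pairs (a, b) to sets of basic symbols; pairs that are
    not mentioned are unconstrained. Returns the tightened network."""
    net = {(a, b): ({"="} if a == b else set(BASIC))
           for a, b in product(variables, repeat=2)}
    for (a, b), r in rel.items():
        net[(a, b)] &= set(r)
        net[(b, a)] &= {CONVERSE[s] for s in r}
    changed = True
    while changed:
        changed = False
        for i, k, j in product(variables, repeat=3):
            tightened = net[(i, j)] & compose(net[(i, k)], net[(k, j)])
            if tightened != net[(i, j)]:
                net[(i, j)] = tightened
                changed = True
    return net


net = tighten_network(
    ["X", "Y", "Z"],
    {("X", "Y"): {"<"}, ("Y", "Z"): {"<"}, ("X", "Z"): {"<", "="}},
)
print(net[("X", "Z")])  # {'<'}
```

An empty set for some pair would signal that the qualitative constraints are inconsistent.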

The second specialization is a multi-level architecture which can reach any
degree of consistency by composing weaker algorithms. Each level is a CLP(FD)
solver adopting an arc-consistency propagation algorithm. By adding further
levels, we are able to incrementally increase the degree of consistency achieved in the
whole architecture by using weaker algorithms, without changing the structure
of each constraint solver. We explicitly represent in the meta-level the relation
X < Y in terms of the consistent couples allowed for the object-level variables X
and Y. Therefore, the meta-level variables range over the consistent couples. By
performing arc-consistency on couples of values, we achieve path consistency
on the underlying level solver. In general, by adding a meta-level solver
performing k-consistency, we incrementally increase the consistency achieved by the
overall architecture by a factor k - 1.

In our architecture, we combine the efficiency and declarativeness of CLP
with the flexibility and modularity of meta architectures. We are currently
investigating the possibility of amalgamating the different levels in
a single constraint solver, thus also increasing the expressive power of CLP(FD).

1 Introduction

In this paper, we compare definitions of n-ary consistency introduced by Dechter
& van Beek [1] and Jegou [2]. We show the duality between relational-k-consistency
and hyper-k-consistency. The algorithm CBT (Constraint-based BackTracking)
results from this comparison study. It is a dual approach with respect to the
standard backtrack algorithm (variable-based BT).

2  Definitions  and  Notations 

Definition 1 [3]. A Constraint Satisfaction Problem P is a tuple P =
(X, D, C, R). $X = \{X_1, \ldots, X_n\}$ is a set of n variables. $D = \{D_1, \ldots, D_n\}$ is a set
of n domains. Each $D_i$ is associated with a variable $X_i$. $C = \{C_1, \ldots, C_m\}$ is a set of m
constraints. Each constraint $C_i$ is defined by a set of variables $\{X_{i_1}, \ldots, X_{i_{n_i}}\} \subseteq X$.
$\{C_1, \ldots, C_m\}$ is called the scheme of C. $R = \{R_1, \ldots, R_m\}$ is a set of m
relations. Each relation $R_i$ defines a set of $l_i$ $n_i$-tuples on $D_{i_1} \times \ldots \times D_{i_{n_i}}$ compatible
w.r.t. $C_i$.

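Definition 2.

- A CSP P = (X, D, C, R) is relational-k-consistent [1] iff

  $\forall C_1, \ldots, C_{k-1} \in C,\ \forall x \in \bigcap_{i=1}^{k-1} C_i,\ \rho(A) \subseteq (\bowtie_{i=1}^{k-1} R_i)[A]$, where $A = (\bigcup_{i=1}^{k-1} C_i) \setminus \{x\}$.

- A CSP P = (X, D, C, R) is hyper-k-consistent [2] iff

  $\forall C_1, \ldots, C_k \in C,\ (\bowtie_{i=1}^{k-1} R_i)[(\bigcup_{i=1}^{k-1} C_i) \cap C_k] = R_k[(\bigcup_{i=1}^{k-1} C_i) \cap C_k]$.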


We recall the notations: projection is denoted $R_i[A]$, join is denoted $R_1 \bowtie R_2$,
and $\rho(A)$ is the set of all consistent instantiations of the variables in A.

3  Comparative  Study  of  N-ary  Consistencies 

Theorem 3. hyper-k-consistency ⇏ relational-k-consistency.

Theorem 4. relational-k-consistency ⇏ hyper-k-consistency.

Relational-k-consistency filters domains. Conversely, hyper-k-consistency
filters all the relations, but leaves domains unchanged. We claim that relational-
consistency and hyper-consistency are not equivalent but dual. This duality
induces a new algorithm, named CBT, dual of the variable-based backtrack
algorithm. CBT exploits the filtering effects of hyper-k-consistency.

4  Constraint-based  Backtracking:  CBT 

The general principle of CBT is to change the objects instantiated in one step of a
backtrack procedure: each time the procedure CBT is called, a set of variables
(instead of one in procedure BT) is instantiated. The set corresponds to the
scheme of one constraint. In this sense, CBT is a dual version of BT.
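
The principle can be sketched in Python as follows. This is a minimal illustration under assumptions, not the CBT procedure itself: a constraint is assumed to be a pair (scope, relation) with the relation given in extension as a set of tuples, constraints are processed in the order given, and variables that occur in no constraint are simply left unassigned.

```python
def cbt(constraints, assignment=None, index=0):
    """Constraint-based backtracking sketch.

    constraints: list of (scope, relation) pairs, where scope is a tuple of
    variable names and relation is a set of value tuples over that scope.
    Returns one assignment satisfying every constraint, or None.
    """
    if assignment is None:
        assignment = {}
    if index == len(constraints):
        return dict(assignment)
    scope, relation = constraints[index]
    for tup in sorted(relation):
        # The tuple must agree with the variables instantiated so far.
        if all(assignment.get(v, val) == val for v, val in zip(scope, tup)):
            added = [v for v in scope if v not in assignment]
            assignment.update(zip(scope, tup))
            result = cbt(constraints, assignment, index + 1)
            if result is not None:
                return result
            for v in added:  # undo this step before trying the next tuple
                del assignment[v]
    return None


constraints = [
    (("X", "Y"), {(1, 2), (2, 3)}),
    (("Y", "Z"), {(3, 1)}),
]
print(cbt(constraints))  # {'X': 2, 'Y': 3, 'Z': 1}
```

Each recursion level picks one tuple of one relation, so the search tree has depth m and branching factor at most l, the largest number of tuples in a relation.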

Theorem 5. The complexity of CBT is in $O(l^m)$. Recall that m = |C| is the
number of constraints in the network, and l is the maximum number of tuples
in a relation.

- CBT will be trivially faster than BT when n > m.

- The ratio n/m seems to be a good parameter to choose between CBT and BT.

-  CBT  is  faster  than  BT  when  m. 

- As relational-k-consistency filters CSPs for variable-based backtracking,
hyper-k-consistency filters CSPs for constraint-based backtracking.

Corollary 6. Let P = (X, D, C, R) be a hyper-k-consistent CSP, and let o(C)
be an order on C. If the width of o(C) on the dual graph of P is smaller than k,
then CBT is backtrack-free relative to o(C).

5  Perspectives 

New methods should be developed to enhance Constraint-based BackTracking.
The main directions we are exploring to reduce the cost of CBT are: ordering
heuristics, look-back methods, look-ahead methods and filtering methods.

This work takes place in the field of Constraint Logic Programming based
on the Interval Propagation Approach [3, 6, 7, 9]. The Fix Point Algorithm (the
kernel part of propagation techniques) is based on local consistency through
networks of relations. These relations belong to a set of particular relations known
by the system: the primary relations. Complex constraints are systematically
decomposed into primary relations. The major weakness of this technique is that
only local reasoning at the level of primary relations is done.

When this level of consistency is too weak (non-convex relations, multiple
occurrence variables, ...), it would be very useful to be able to
tackle some delicate constraints globally (i.e. handling such complex constraints
as a whole). We introduce here a very simple methodology to cope with this
problem: it only relies on basic Interval techniques (Fix Point and Enumeration),
and the global semantics of the solver is unchanged (i.e. the characterization of
the solutions is the same). The main idea can be stated as follows: if we want
to handle a complex constraint (C) globally, then a specific enumeration is
called to narrow the domains of the variables of (C) each time the
projections of this constraint have to be re-evaluated (the basic step of the Fix Point
Algorithm). The Fix Point Algorithm and the Enumeration are, in that case,
mutually recursive, contrary to usual enumeration techniques.
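
To see why multiple occurrences of a variable weaken local reasoning, and why enumeration helps, consider the following Python sketch. It is only an illustration under assumptions, not the Prolog IV mechanism: intervals are plain pairs of floats, outward rounding is ignored, and the expression x(1 + x) is used as a small example of a term in which x occurs twice.

```python
def interval_add(a, b):
    """Sum of two intervals given as (low, high) pairs."""
    return (a[0] + b[0], a[1] + b[1])


def interval_mul(a, b):
    """Product of two intervals given as (low, high) pairs."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def natural_extension(x):
    """Evaluate z = x * (1 + x), treating the two occurrences of x as if
    they were independent, as a decomposition into primary relations does."""
    return interval_mul(x, interval_add((1.0, 1.0), x))


def split_evaluation(x, pieces):
    """Enumerate `pieces` equal sub-intervals of x, evaluate the expression
    on each of them and return the hull of the results."""
    low, high = x
    width = (high - low) / pieces
    results = [natural_extension((low + k * width, low + (k + 1) * width))
               for k in range(pieces)]
    return (min(r[0] for r in results), max(r[1] for r in results))


x = (-1.0, 0.0)
print(natural_extension(x))    # lower bound -1.0, far below the true minimum -0.25
print(split_evaluation(x, 4))  # lower bound -0.375, much closer to -0.25
```

The enclosure obtained after splitting is still valid, since the true range of x(1 + x) on [-1, 0] is [-0.25, 0], but it is considerably narrower than the one computed in a single step.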

We now present some experimental results obtained by solving Broyden-
Banded equation systems in Prolog IV [1] using this technique. The results are
compared with those of Newton (a sophisticated CLP(Intervals) language using
formal handling, Newton derivative methods, and a specific consistency notion:
box-consistency [2]).

The Broyden-Banded problem, made of polynomial equations of the third
degree, is a classical benchmark for testing numerical solvers. Here is the definition
of the problem $H_n$ (n equations, n variables) [4, 5]:

$\forall i \in 1..n,\quad x_i (2 + 5 x_i^2) + 1 - \sum_{j \in J_i} x_j (1 + x_j) = 0$

with $J_i = \{\, j \in \mathbb{N} \mid j \neq i,\ \max(1, i-5) \le j \le \min(n, i+1) \,\}$

*  This  work  is  done  under  a  CIFRE  contract  with  ProloglA,  and  an  ANVAR  research 
contract. 

**  Computer  Science  Professor,  Universite  de  la  Mediterranee. 

We solve this problem by applying complex constraint handling to the equations
$y_i = x_i(2 + 5 x_i^2)$ and $z_i = x_i(1 + x_i)$, which cannot be solved efficiently by a
simple decomposition into primary relations because $x_i$ occurs twice in both
expressions. Execution times (in seconds), given in the following table,
were obtained on a PC Pentium (Prolog IV) and a Sun SS 10/20 (Newton).


n      Prolog IV    Prolog IV (with complex    Newton
       (normal)     constraint handling)
5      2.9          4.4                        1.2
10     10.9         13.4                       8.8
20     279.8        29.9                       25.9
40     > 10,688     66.6                       61.6
80     -            134.4                      127.7
160    -            274.7                      264.6

Execution times of Prolog IV (with complex constraint handling) and Newton
are similar and seem to grow linearly with n, in contrast to the usual technique,
whose running time grows much faster (it appears to be exponential).

The methodology we have briefly presented here is useful for tackling problems
like Broyden-Banded equation systems. On this kind of problem, its
efficiency is comparable to that of more sophisticated systems like Newton. Considering
that our system only relies on a global treatment of complex expressions,
done in a very straightforward manner, we can conclude that this mechanism,
and only this mechanism, is fundamental to solving such problems. This
methodology can also be seen in another way: it can easily be used to design prototypes
that make it possible to test the behaviour of global constraints without any
implementation effort.

Complete algorithms for solving propositional satisfiability fall into two main
classes: backtracking search (e.g., the Davis-Putnam Procedure [1]) and
resolution (e.g., the original Davis-Putnam Algorithm [2] and Directional Resolution
[4]). Backtracking may be viewed as a systematic “guessing” of variable
assignments, while resolution is inferring, or “thinking”. Experimental results show
that “pure guessing” or “pure thinking” might be inefficient. We propose an
approach that combines both techniques and yields a family of hybrid algorithms,
parameterized by a bound on the “effective” amount of resolution allowed. The
idea is to divide the set of propositional variables into two classes: conditioning
variables, which are assigned truth values, and resolution variables, which are
resolved upon. We report on preliminary experimental results demonstrating that
on certain classes of problems hybrid algorithms are more efficient than either
of their components in isolation.

The well-known Davis-Putnam Procedure (DP) is a backtracking algorithm
enhanced by unit resolution at each level of the search. Directional Resolution
(DR) [4] is a variable-elimination algorithm similar to adaptive-consistency for
constraint satisfaction. Its worst-case time and space complexity is exponential
in the induced width, w*, of the interaction graph of a propositional theory. The time
complexity of DP is worst-case exponential in the number of variables, while its
space complexity is linear. However, on average DP is relatively efficient, while
DR’s average complexity is close to its worst case. Consequently, DR is
significantly less efficient than DP when applied to uniformly generated 3-cnfs having
large w*, while outperforming DP by many orders of magnitude when applied
to theories with bounded w* [4]. This time- and space-wise complementary
behavior of the two algorithms prompted the idea of combining DP and DR.

We propose a family of hybrid algorithms, called Dynamic Conditioning +
DR (DCDR), parameterized by a bound, b, that controls the balance between
resolution and backtracking. Given b, the algorithm DCDR(b) selects a subset of
conditioning variables, or cutset, C_b, such that w* of the resulting (conditional)
theory does not exceed b. The hybrid algorithm searches the space of truth
assignments for the conditioning variables and resolves upon the rest of the

*  This  work  was  partially  supported  by  NSF  grant  IRI-9157636,  by  the  Electrical 
Power  Research  Institute  (EPRI)  grant  RP8014-06,  and  by  Rockwell  MICRO  grants 
UCM-20775  and  95-043. 
variables. Dividing the set of variables into the cutset and resolution variables
is accomplished during run time, i.e. dynamically. We have also experimented
with a static version of the algorithm (for details see the full paper available
through http://www.ics.uci.edu/~irinar). We show that the time complexity of
both algorithms is O(exp(c + b)), where c is the largest cutset size encountered
during run time.
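
The interplay of conditioning and resolution can be sketched in Python as follows. This is a simplified static variant written under assumptions, not DCDR(b) itself: the cutset is supplied by the caller instead of being selected at run time under the bound b, no unit resolution or variable ordering heuristic is used, clauses are frozensets of non-zero integers (a negative integer denotes a negated variable), and the cutset together with the resolution variables is assumed to cover every variable of the theory.

```python
def condition(clauses, literal):
    """Assign `literal` to true: drop satisfied clauses and remove the
    falsified literal from the remaining ones."""
    return {c - {-literal} for c in clauses if literal not in c}


def resolve_out(clauses, var):
    """Eliminate `var` by replacing the clauses that mention it with all
    their non-tautological resolvents on `var`."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    rest = {c for c in clauses if var not in c and -var not in c}
    for p in pos:
        for n in neg:
            resolvent = (p - {var}) | (n - {-var})
            if not any(-lit in resolvent for lit in resolvent):
                rest.add(resolvent)
    return rest


def hybrid_sat(clauses, cutset, others):
    """Search over truth assignments of the `cutset` variables and eliminate
    the `others` by resolution. Returns True iff the theory is satisfiable."""
    clauses = set(clauses)
    if frozenset() in clauses:
        return False
    if cutset:
        v, tail = cutset[0], cutset[1:]
        return (hybrid_sat(condition(clauses, v), tail, others)
                or hybrid_sat(condition(clauses, -v), tail, others))
    for v in others:
        clauses = resolve_out(clauses, v)
        if frozenset() in clauses:  # the empty clause was derived
            return False
    return True


theory = [frozenset(c) for c in ([1, 2], [-1, 2], [-2, 3])]
print(hybrid_sat(theory, cutset=[1], others=[2, 3]))  # True
```

With an empty cutset the sketch degenerates to pure variable elimination, and with an empty list of resolution variables to pure enumeration of truth assignments, which mirrors the two extremes that the bound b interpolates between.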

We tested DCDR(b) on uniform k-cnfs and on structured problems having
bounded w*, such as (k, m)-trees. A (k, m)-tree is a tree of cliques, each having
k + m nodes, where k is the size of the intersection between each two neighboring
cliques. We observed three different behavior patterns depending on w* (see
Figure 1): 1. on problems having large w*, such as uniform 3-cnfs around the 50%
solvable crossover point (the transition region from satisfiable to unsatisfiable
problems), the time complexity of DCDR(b) is similar to DP when b is small
(obviously, a bound b = -1 does not allow any resolution, making DCDR(-1)
equivalent to DP); however, when b increases, the CPU time for DCDR(b) grows
exponentially; 2. theories having very small w* (such as (k, m)-trees with k <
4, m < 6) are easier for DCDR(b) with a large b, since DCDR(b) coincides
with DR for b > w*; 3. on (k, m)-trees with larger clique size, we observed an
intermediate region of b values yielding a faster algorithm than both DP and
DR. The averages for uniform 3-cnfs are computed on 100 problem instances,
while for (k, m)-trees we ran only 25 experiments per point. We therefore view
our results as preliminary. However, they indicate the general promise of the
approach.

We see that w* provides a reasonable predictor of b. When w* is very large, we
choose b < 1; when w* is very small (less than 4), we choose a large b; for
intermediate levels of w* it is better to choose a bounded level of b. The
algorithms having b in the range of 5 to 8 seem promising, since they behave
similarly to DP on uniform instances and to DR for small w*, while for
intermediate values of w* they exploit the benefits of both DP and DR. The
hybrid algorithms trade space for time [3], and output a compiled theory from
which a portion of the solution set, rather than one solution, can be generated
in linear time.

1 Introduction

We introduce a method for adding a limited form of local control for cc programs
in a potentially distributed context.

In settings where a large number of agents interact through (possibly many
distinct) shared stores, it is desirable to have a local simplification mechanism

is exhausted disappear. “Local” means
here that simplification in a given store should not depend on knowledge
of information located in remote stores, nor on knowledge of the whole multiset
of agents active in the store. Rather, we envision the simplification mechanism
as a kind of reaction rule, in the spirit of Berry and Boudol’s ChAM “chemical”
rules [1]: a given agent reacts to its perceived, local environment by reducing
to true. The simplification is thus parameterized by the scope of the perceived

to be as local as possible, the information that is used for control has to be
carried by the agents, i.e. integrated at the language level.


2  A  Language  Extension  :  cc*’*’’* 

We define a strict extension of the determinate cc languages by marking agents
or multisets of agents with tags. The basic idea is the following: an agent may

interpreted as upper bound/lower bound

on the information the agent will eventually add to the store in the course of a
complete computation. The upper bound is used to eliminate the agent when all
the information it may add to a store is already entailed by it, while the lower
bounds are used to guarantee that a group of agents will eventually bring as
much information as a given agent and to eliminate the latter.

An  adequate  modification  of  the  deterministic  cc  operational  semantics  ensures 
transmission  of  tags  and  simplification. 

The potential information I+(A) of an agent A in context (program) P is
defined as the upper bound, along all computational paths starting
in c and ending in the terminal store, of (the entailment closure of) the set of


These quantities are usually hard to determine exactly: tag i+ (resp. i-) is used
to keep track of approximate information on I+ (resp. I-). This information
enables dynamic simplification of two types of agents:

1.  quiescent  agents,  whose  potential  information  is  entailed  by  the  current  store 

2.  redundant  agents,  whose  potential  information  is  entailed  by  the  minimal 
information  of  a  group  of  other  agents  of  the  program 

Tagging a cc program amounts to selecting a certain subset of (hopefully
truncated) computation paths of the initial program, and therefore trivially preserves
correctness.

We distinguish two complementary approaches to program tagging: programs
may be “manually” tagged by a programmer using expert knowledge of the
specific application, or one may rely on a procedure that partially automates
the tagging of a program and preserves completeness.

3  Static  Analysis 

To affix positive tags to a program P, we make a nonstandard use of the classical
adjoint framework of abstract interpretation [2]. We compute upper
approximations of the potential information of agents on an abstract constraint system and
use these approximations to positively tag the agents in a sound manner, i.e. in
such a way that the tagged program terminates in the same store as the initial
program. We reason on a ‘reverse’ abstract interpretation of P, i.e. P is
syntactically transformed into a program P’ computing on an abstract constraint system
A that returns constraints stronger than the ones returned by the concrete
computation. P’ is then executed and each agent in P is tagged by the information
produced by its abstract counterpart during the abstract computation.

4  Future  Work 

The very nature of this simplification scheme makes it suitable for use in a
distributed framework. We are developing a distributed cc execution model that
would help assess local ‘mobile’ control techniques of the type introduced here
as opposed to central control strategies (or central control strategies local to a
given processor), and in general local control techniques depending on the
‘degree of locality’.

We are also designing more involved abstract interpretation techniques that
address the issue of negative tagging.

Constraint Satisfaction Problems (CSPs) belong to the family of NP-complete
problems. The complexity of finding one solution of a CSP is lower than or
equal to the complexity of finding all the solutions of a CSP; this paper focuses
on the latter task. Lower upper bounds on the complexity of solving CSPs were found
for CSPs that are represented as graphs of constraints with special properties
[1, 3]. This paper presents a new method for finding upper bounds on the
complexity of solving CSPs. In many cases this method achieves better bounds than
the other methods. An algorithm for finding all solutions of a CSP, based
on this method and called DAP, is presented. Algorithms previously developed for
finding all the solutions of CSPs ([2], [4]) do not handle loose CSPs well. We compare
DAP to them and show that DAP outperforms them in many cases; moreover, it is the
only algorithm that behaves well (both in time and space requirements)
for loose CSPs.

The main idea is based on problem partitioning. Given a CSP divided into m
groups, $G_1, G_2, \ldots, G_m$, where group $G_i$ has $n_i$ nodes, $e_i$ internal
constraints, largest domain size $v_i$, and $c_i$ nodes connected by external
constraints to other groups, we will find all the solutions of the CSP. Assume that
$S_i$ is the set of all the solutions of $G_i$. Note that we do not require an explicit
representation of all the solutions (because the representation itself may be of
exponential size); we will be content with a solution in which each $S_i$ is divided
into $K_i$ groups and in which, for any tuple $(k_1, \ldots, k_m \mid k_i \in K_i)$, it is recorded
whether all the members of those groups are solutions of the CSP or none
of them is.

At the first stage, find all the internal solutions for each component. The
complexity of this process is $O(\sum_{i=1}^{m} e_i v_i^{n_i})$. Then divide the solutions of each
group into groups according to the values in the nodes that participate in external
constraints; the complexity of this stage is $O(\sum_{i=1}^{m} n_i v_i^{n_i})$.

After the internal solutions are found and divided into groups according to
the values in all the nodes that participate in external constraints, proceed to
find the groups of solutions that are compatible. This is done in the following
way: start with an empty group of groups of partial solutions, and combine at each
step the groups of solutions of another $G_i$ with the partial solutions found so
far. The combination of the groups of solutions in group $G_i$ with the groups of
partial solutions is done by testing, for each group in $G_i$, which groups of partial
solutions are legal. In the next stage the group of partial solutions will include
all the legal combinations between the groups in $G_i$ and the previous groups of
partial solutions. After all the m groups have been combined with the group of partial
solutions, it will include all the groups of solutions. The total complexity of
the combination process is $O(\prod_{i=1}^{m} v_i^{c_i})$. Thus the total complexity is

$O\left(\prod_{i=1}^{m} v_i^{c_i} + \sum_{i=1}^{m} (n_i + e_i)\, v_i^{n_i}\right)$.   (1)

We will now consider the problem of finding all the solutions of a CSP.
Finding all the solutions is of importance when trying to find an optimal solution
of a CSP, since all the solutions can then be generated and tested one by one.
Traditionally, two approaches have been used for finding all the solutions of a CSP:
synthesis algorithms, and exhaustively applying the backtrack-based algorithms
used for finding one solution until all the search space is traversed (see [5]).
As Tsang [5] pointed out, both the synthesis methods and the backtrack-based
methods can be an acceptable choice when the CSP is tight; however, when
the CSP is loose, neither of them is suitable for the task. The DAP algorithm
is aimed at the loose case. The DAP algorithm is strongly based on the
method for evaluating complexity previously presented. Basically: first, partition the
vertices of the CSP into disjoint sets using a hill-climbing-based algorithm, trying
to achieve a low value for formula (1). Afterwards, find all the solutions of
the CSP using the partition previously found and applying the same method we
used for evaluating the complexity.
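
The three stages (internal solutions, grouping by boundary values, combination of groups) can be sketched in Python as follows. This is only a rough sketch under assumptions, not the DAP implementation: the partition is supplied by the caller rather than found by hill climbing, constraints are binary and given as hypothetical (x, y, predicate) triples, and internal solutions are enumerated by brute force.

```python
from itertools import product


def solve_by_partition(domains, constraints, groups):
    """Return all solutions in grouped form: each result is a list holding,
    for every group, a bucket of internal solutions; any way of picking one
    solution per bucket yields a solution of the whole CSP."""
    group_of = {v: g for g, members in enumerate(groups) for v in members}
    internal = [[] for _ in groups]
    external = []
    for x, y, ok in constraints:
        if group_of[x] == group_of[y]:
            internal[group_of[x]].append((x, y, ok))
        else:
            external.append((x, y, ok))
    boundary = {v for x, y, _ in external for v in (x, y)}

    # Stages 1 and 2: internal solutions, bucketed by their boundary values.
    buckets = []
    for g, members in enumerate(groups):
        keys = [v for v in members if v in boundary]
        by_key = {}
        for values in product(*(domains[v] for v in members)):
            sol = dict(zip(members, values))
            if all(ok(sol[x], sol[y]) for x, y, ok in internal[g]):
                key = tuple(sol[v] for v in keys)
                by_key.setdefault(key, []).append(sol)
        buckets.append((keys, by_key))

    # Stage 3: combine the buckets group by group, checking only the
    # external constraints on the boundary values.
    partial = [({}, [])]
    for keys, by_key in buckets:
        extended = []
        for assigned, chosen in partial:
            for key, sols in by_key.items():
                merged = {**assigned, **dict(zip(keys, key))}
                if all(ok(merged[x], merged[y]) for x, y, ok in external
                       if x in merged and y in merged):
                    extended.append((merged, chosen + [sols]))
        partial = extended
    return [chosen for _, chosen in partial]


def count_solutions(combos):
    """Number of full solutions represented by the grouped result."""
    total = 0
    for chosen in combos:
        size = 1
        for sols in chosen:
            size *= len(sols)
        total += size
    return total
```

The grouped result never lists the full solutions one by one, which is what keeps the representation compact when the CSP is loose and each bucket is large.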

The behavior of the DAP algorithm was tested and compared using a set
of random CSPs with 10 variables and 6 values. For sparse CSPs, the DAP
algorithm outperforms the backtrack-based algorithms by 2 to 3 orders of
magnitude. For tight CSPs, the partition in DAP yields just one group;
therefore DAP and the backtrack-based algorithms perform similarly. Note that an
empirical comparison with the synthesis algorithms was impractical due to their
huge space requirements. Similarly, increasing the number of variables or the domain
size of the CSPs makes the backtrack-based algorithms completely impractical.

Constraint satisfaction problems (CSPs) are part of many real-world domains,
such as computer vision and scheduling problems. Often, CSPs are solved in real
life by several agents, each of them working on a part of the problem [3, 4]. A
distributed CSP (DCSP) can be viewed as a set of constraint networks (CNs), each CN
being solved by a different agent, where the CNs are connected by constraints.
A major assumption of the present paper is that checking constraints inside the
distributed components has a much lower cost than checking constraints across
different components. The latter check involves some kind of message passing
that the solving algorithm would like to minimize.

The processing of CNs has been studied extensively in the last decade [1, 2],
usually within the standard model, which is sequential. Several attempts have
been made at studying the processing of CNs in parallel. The most relevant
study of distributed CSPs has been made by Yokoo [5]. The basic difference
between our approach and Yokoo’s approach is that our algorithms try to take
advantage of the differences between the components of a DCSP.

The model of a DCSP in the present paper uses agents that are connected
by a communication network (i.e., no common memory, just message passing).
The number of agents is equal to, or larger by a small constant than, the number
of subproblems in the given division of the DCSP. Based on this, we state the
following goals for our multi-agent algorithms:

-  Try  to  optimize  the  performance  of  the  slowest  agent,  rather  than  optimizing 
each  individual  agent. 

-  Minimize  the  amount  of  backtracking  each  agent  performs  as  a  result  of 
actions  of  other  agents. 

In general, a DCSP may be represented in two ways. The Explicit representation
is the original one, where a variable in one component may be connected by a
constraint to any other variable in the same or in a different component. In
the Canonical representation, a new, central component is added. This
component contains copies of all variables that are connected by
inter-component constraints, such that solving the CSP of this central
component guarantees that all global constraints are satisfied. The
equivalence of the two representations can be shown easily.
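
The construction of the central component can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: constraints are
binary and given as pairs of variable names, each agent owns a disjoint set of
variables, and the function name `central_component` and the data layout are
hypothetical rather than taken from the paper.

```python
def central_component(components, constraints):
    """Collect the variables and constraints of the central component.

    components:  dict mapping an agent name to the set of variables it owns
    constraints: iterable of (u, v) pairs, one per binary constraint
    (both layouts are assumptions made for this sketch)
    """
    owner = {var: agent for agent, owned in components.items() for var in owned}
    central_vars = set()
    central_constraints = []
    for u, v in constraints:
        if owner[u] != owner[v]:  # inter-component constraint
            central_vars.update((u, v))  # the central agent holds a copy of each endpoint
            central_constraints.append((u, v))
    return central_vars, central_constraints


components = {"A": {"a1", "a2"}, "B": {"b1", "b2"}}
constraints = [("a1", "a2"), ("a2", "b1"), ("b1", "b2")]
central_vars, central_constraints = central_component(components, constraints)
print(sorted(central_vars), central_constraints)
```

Only the constraint between `a2` and `b1` crosses a component boundary here, so
the central component holds copies of just those two variables.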

Four sound and complete algorithms for solving DCSPs are proposed. Two of
these algorithms are sequential, and two operate in parallel and are
inherently distributed. They can be summarized as follows:

1. Algorithm 1. Treat the DCSP as a regular CSP and solve it by one of the
commonly known techniques, disregarding the distribution into components.
Note that at most one agent is working at any time.

2. Algorithm 2. Same as 1, except that agents first work on their own
sub-problems.

3. Algorithm 3. The first agent first finds a solution to the central
component. It then broadcasts this solution to all other agents. These agents
search for solutions to their corresponding sub-problems in parallel. If all
of them find a consistent solution to their sub-problems, we are done;
otherwise the central agent must backtrack and broadcast a new solution to
the central component.

4. Algorithm 4. Here, the peripheral agents search for solutions in parallel
and send their solutions to the central agent. If the central agent can find
a consistent solution, we are done; otherwise, the first agent that caused
the failure is asked to backtrack and send its new solution back to the
central agent. The backtracking is done sequentially to ensure completeness.

Algorithms 3 and 4 were designed for two opposite cases of DCSPs: a dominant
central component seems natural for Algorithm 4, while dominant peripheral
components call for Algorithm 3.
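
The control flow of Algorithm 3 can be illustrated with a small sequential
simulation. This is a sketch under several assumptions, not the authors'
implementation: sub-problems are solved by brute-force enumeration, the
peripheral agents are visited one after another instead of in parallel,
shared variables keep the same name in the central and peripheral components,
and a broadcast is counted as one message per peripheral agent. The names
`solutions` and `central_first` are hypothetical.

```python
from itertools import product


def solutions(variables, domains, constraints, fixed=None):
    """Yield every assignment of `variables` that extends `fixed` and passes all checks."""
    fixed = fixed or {}
    free = [v for v in variables if v not in fixed]
    for values in product(*(domains[v] for v in free)):
        assignment = {**fixed, **dict(zip(free, values))}
        if all(check(assignment) for check in constraints):
            yield assignment


def central_first(central, peripherals, domains):
    """Sequential simulation of the central-first scheme (Algorithm 3)."""
    central_vars, central_checks = central
    messages = 0
    for central_solution in solutions(central_vars, domains, central_checks):
        messages += len(peripherals)  # assumed cost: one message per peripheral agent
        merged = dict(central_solution)
        for local_vars, local_checks in peripherals:
            # Each peripheral agent keeps the broadcast values of its shared variables fixed.
            shared = {v: central_solution[v] for v in local_vars if v in central_solution}
            extension = next(solutions(local_vars, domains, local_checks, shared), None)
            if extension is None:
                break  # this agent fails, so the central agent must backtrack
            merged.update(extension)
        else:
            return merged, messages
    return None, messages


domains = {v: [0, 1, 2] for v in ("x1", "x2", "y1", "y2")}
central = (["x1", "y1"], [lambda a: a["x1"] != a["y1"]])
peripherals = [
    (["x1", "x2"], [lambda a: a["x1"] < a["x2"]]),
    (["y1", "y2"], [lambda a: a["y1"] < a["y2"]]),
]
solution, messages = central_first(central, peripherals, domains)
print(solution, messages)
```

Enumerating the central component lazily means a new central solution is
produced only when some peripheral agent fails, which mirrors the backtracking
step in the description above.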

The behavior of the proposed algorithms was tested by generating and solving
a set of random DCSPs. The main parameters varied in the experiment were the
number and tightness of internal and external constraints. The two main
measures were the number of constraint checks performed and the number of
messages passed. The latter measure is particularly important in a
distributed environment. The results show Algorithm 2 to be the worst.
Algorithms 3 and 4 are much better when there are great differences between
the tightness of local and external constraints, while Algorithm 1 is good
only when this tightness is equal for all constraints (it really does not pay
to decompose the problem!). As expected, Algorithm 3 is much better when the
central component is tighter, while Algorithm 4 is better when the peripheral
components are tighter. In the future we plan to use these results in solving
real-life distributed resource allocation problems such as those in [4].

This note describes one aspect of our exploration of the machine synthesis of
scheduling algorithms [1, 2]. The approach involves several stages. The first
step is to develop a formal model of the scheduling domain, called a domain
theory. Second, the constraints, objectives, and preferences of a particular
scheduling problem are formally stated within the language of the domain
theory as a problem specification. Finally, an executable scheduler is
produced semi-automatically by applying a sequence of transformations to the
problem specification. The transformations embody programming knowledge about
algorithms, data structures, program optimization techniques, etc. The result
of the transformation process is executable code that is correct by
construction. Furthermore, the resulting code can be extremely efficient.

In this note we focus on scheduling a class of resources that we call
Asynchronously Shared Resources (ASRs) (see also [3]). An ASR can be shared
simultaneously by many users or tasks whose usage patterns are not necessarily
synchronized. The resource is assumed to have finite capacity and the tasks
are assumed to use the resource for finite periods. A typical example of an
ASR is an automobile parking lot having n parking slots. Users can come and go
independently, but a scheduler should never assign more than n users to the
parking lot at the same time. More generally, any pool of individual resources
can be treated as an ASR: ramp space at airports, machining tools in
manufacturing, computer processors running in parallel, fleets of
transportation vehicles, personnel in a skill pool, etc. Power sources (e.g.,
generators, batteries) provide nondiscrete examples of ASRs. ASR scheduling
can be seen as a special case of multi-capacitated job-shop scheduling where
there is one multi-capacity machine.

The  motivation  for  this  work  was  to  ease  the  burden  of  creating  domain 
theories,  by  building  a  library  of  abstract  theories  for  various  classes  of  resources. 
Included  in  each  theory  would  be  various  axioms,  lemmas,  and  theorems  that 
facilitate  the  inference  (at  design  time)  of  constraints  for  efficient  propagation. 

Suppose we are given a set of tasks where each task tsk has an earliest start
time, latest start time, duration dur(tsk), and demand demand(tsk). The problem
of scheduling an ASR can be defined as follows: given a set T of tasks and an
ASR with capacity c, find an assignment of start times st(tsk) to each task that
satisfies the ASR capacity constraint: at no time does the demand on the ASR
exceed its capacity; formally, $\forall (t : \mathit{time})\ \mathit{demand}(T, t) \le c$, where

$$\mathit{demand}(T, t) = \sum_{\substack{tsk \in T \\ st(tsk) \le t < st(tsk) + dur(tsk)}} \mathit{demand}(tsk)$$

computes the aggregate or net demand of the tasks in T at time t.
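
As a concrete reading of the capacity constraint, the following Python sketch
computes the net demand at a time point and checks a schedule against a
capacity. It assumes integer time and that a task occupies the resource on the
half-open interval from its start time up to, but not including, its end time;
the names `Task`, `demand_at`, and `satisfies_capacity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    start: int   # assigned start time st(tsk)
    dur: int     # duration dur(tsk)
    demand: int  # capacity units used while the task runs


def demand_at(tasks, t):
    """Net demand of all tasks that are running at time t."""
    return sum(k.demand for k in tasks if k.start <= t < k.start + k.dur)


def satisfies_capacity(tasks, capacity):
    """Check the capacity constraint for a complete schedule.

    Net demand can only rise at a start time, so checking the start
    times is enough to cover every time point.
    """
    return all(demand_at(tasks, k.start) <= capacity for k in tasks)


# Parking-lot style example: three cars, each using one slot.
schedule = [Task(start=0, dur=3, demand=1),
            Task(start=1, dur=2, demand=1),
            Task(start=2, dur=2, demand=1)]
print(demand_at(schedule, 2))
print(satisfies_capacity(schedule, 2), satisfies_capacity(schedule, 3))
```

All three tasks overlap at time 2, so a lot with two slots is oversubscribed
while one with three slots is not.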

The objective of this study was to work out how to synthesize a variety of
algorithms for ASR scheduling. Our intent was more to represent the design
knowledge necessary for deriving the best possible scheduling algorithms,
rather than to derive new algorithms.

Two classes of algorithms were derived, which we call discrete and aggregate.
A discrete algorithm assumes that the capacity bound on the ASR is integral
and that each task consumes one unit of capacity. The data structures and
constraint propagation techniques are well known from the literature on
unit-capacity machine scheduling.

In aggregate algorithms we treat the scheduling of an ASR as a whole,
assuming no internal structure to the capacity of an ASR. The key idea here is
to maintain (via finite differencing [1]) data structures that represent lower
and upper bounds on demand(T, t) for all t, called respectively the definite
and possible demand maps. The main disjunctive constraint that we derived
states that whenever a block of tasks has definitely reserved the ASR at some
time (i.e., no other task could feasibly be executed at that time), then any
other task must either precede or succeed some task in the block. There are
various strategies for choosing how to apply this constraint. The possible
demand map is used to drive branching at potentially oversubscribed times.
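
One common way to obtain such lower and upper bounds from start-time windows
is sketched below. It is offered as an illustration only and is not
necessarily the derivation used by the authors: a task definitely runs at time
t if it runs there for every start time in its window, and possibly runs at t
if it runs there for at least one. Integer time and half-open intervals are
assumed, and the names `Window`, `definite_demand`, and `possible_demand` are
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    est: int     # earliest start time
    lst: int     # latest start time
    dur: int     # duration
    demand: int  # capacity units used while running


def definite_demand(tasks, t):
    """Lower bound: tasks that run at t whichever start time in [est, lst] is chosen."""
    return sum(k.demand for k in tasks if k.lst <= t < k.est + k.dur)


def possible_demand(tasks, t):
    """Upper bound: tasks that run at t for at least one start time in [est, lst]."""
    return sum(k.demand for k in tasks if k.est <= t < k.lst + k.dur)


tasks = [Window(est=0, lst=2, dur=4, demand=1),
         Window(est=3, lst=3, dur=2, demand=2)]
for t in (1, 3, 5):
    print(t, definite_demand(tasks, t), possible_demand(tasks, t))
```

A time point whose possible demand exceeds the capacity is potentially
oversubscribed, which is the kind of point the text describes as driving
branching.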

Our overall experience with these algorithms is that they will either find
a feasible schedule quickly or else take a very long time to complete. Finding
a schedule quickly means that little or no backing up occurs during search:
for the most part, only descendants and siblings are explored. The aggregate
algorithms can be tuned either to find better solutions or else to solve
harder problems, although they also tend to be somewhat slower than the
discrete algorithms due to the expense of maintaining the demand maps and the
extra complexity of deciding how to branch.

This  work  is  intended  to  bring  the  goal  of  machine  support  for  synthesizing 
customized  high-performance  scheduling  algorithms  one  step  closer  to  practical 
realization. 

The Generalized Railroad Crossing:
Its Symbolic Analysis in CLP($\mathcal{R}$)


Luis  Urbina 


Abstract. The symbolic simulation and analysis method for hybrid systems
presented in [2] is illustrated by means of the benchmark problem "The
Generalized Railroad Crossing" [1].


1  The  Generalized  Railroad  Crossing  (GRC) 

The GRC can be described as follows: The system operates a gate at a railroad
crossing. The railroad crossing intersection lies in a region of interest region. A set
of trains travel through region on multiple tracks in both directions. A sensor system
determines when each train enters and exits region. We define the time-dependent
gate function $g(t) \in [0, 90]$, where $g(t) = 0$ means the gate is down and $g(t) = 90$
means the gate is up, and the set $\{[\mu_i, \nu_i]\}$ of occupancy intervals, where each
occupancy interval $[\mu_i, \nu_i]$ is a time interval during which one or more trains are in
intersection: $\mu_i$ is the time of the $i$-th entry of a train into the crossing when no other
train is in the crossing, and $\nu_i$ is the first time since $\mu_i$ that no train is in the crossing.
Given two constants $\gamma_1 > 0$ and $\gamma_2 > 0$, develop a system to operate the gate in
such a way that it satisfies the following two properties. Safety Property: 'The gate
is down during all occupancy intervals.' $\forall t \in \bigcup_i [\mu_i, \nu_i] :\ g(t) = 0$. Obviously, this
property can be trivially satisfied by blocking the crossing. Thus, a stronger property is
given, which rules out such trivial designs. Utility
Property: 'The gate is up when no train is in the crossing.'
$\forall t \notin \bigcup_i [\mu_i - \gamma_1, \nu_i + \gamma_2] :\ g(t) = 90$.
This further requirement ensures that, $\gamma_2$ time units after the start of a
non-occupancy interval, the gate must be open, $g(t) = 90$, and remains open until $\gamma_1$
time units before the end of that interval, i.e., $\gamma_1$ time units before the beginning of
the next occupancy interval. The GRC is modeled as the product of $m + 1$ linear hybrid
systems, i.e., $\mathit{GRC} = \mathit{RAIL}_1 \times \dots \times \mathit{RAIL}_m \times \mathit{GATE}$ ($m \ge 1$). Each of
them is first modeled by a linear hybrid automaton and then transformed into a
CLP($\mathcal{R}$) program. The product is a CLP($\mathcal{R}$) program as well.


Rail. The rail component $\mathit{RAIL}_i$ is shown in Figure 1 a). $c_i$ is the clock for $\mathit{RAIL}_i$.
$k$ is a counter variable for the number of trains inside the railroad crossing
intersection. $\mathit{enter}_i$ and $\mathit{exit}_i$ are the sensor signals when a train enters region and
when it leaves intersection, respectively. $\sigma_i$ is the minimal time that must elapse
between the exit of a train from intersection and the entry of the next train into
region. $\varepsilon_{i1}$ ($\varepsilon_{i2}$) is the lower (upper) bound of the time that a train takes from region
to intersection. $\tau_{i1}$ ($\tau_{i2}$) is the lower (upper) bound of the time that a train takes
to leave intersection. Some restrictions: $\sigma_i > 0$, $0 < \varepsilon_{i1} < \varepsilon_{i2}$ and $0 < \tau_{i1} < \tau_{i2}$.

Fig. 1. a) $\mathit{RAIL}_i$ and b) GATE


Gate. The gate component GATE is shown in Figure 1 b). $g$ is the gate function.
$v_1$ ($v_2$) is the lower (upper) bound of the rate to lower the gate completely.

We have revised both requirements. The locations $\mathit{far}^i$, $\mathit{region}^i$, and $\mathit{intersection}^i$
are the corresponding locations for $\mathit{RAIL}_i$. Safety Property: 'Whenever the train is
in the railroad intersection the gate is down.'
$\neg((\mathit{intersection}^1 \vee \dots \vee \mathit{intersection}^m) \wedge \neg\mathit{down})$.
The GRC satisfies the safety requirement provided it never reaches a state
where a train on $\mathit{RAIL}_i$ is in $\mathit{intersection}^i$ and the GATE is not in down. We
introduce an extra clock $x_i$, which counts the delay since a train on $\mathit{RAIL}_i$
has left intersection. Thus, in the modified hybrid system $\mathit{RAIL}'_i$, $x_i$ is reset in
the transition from intersection to far. Utility Property: 'Whenever no train is in
the railroad intersection the gate is up.'
$\neg[[(\mathit{going\_down} \wedge \mathit{far}^i) \vee (\mathit{down} \wedge \mathit{far}^i) \vee (\mathit{going\_up} \wedge \mathit{far}^i) \vee (\mathit{going\_down} \wedge \mathit{region}^i) \vee (\mathit{down} \wedge \mathit{region}^i) \vee (\mathit{going\_up} \wedge \mathit{region}^i)] \wedge (c_1 < \varepsilon_{12} - \gamma_1 \wedge \dots \wedge c_m < \varepsilon_{m2} - \gamma_1 \wedge x_1 > \gamma_2 \wedge \dots \wedge x_m > \gamma_2)]$.


2  Symbolic  Analysis 

Example 1 The GRC with one rail. We set $\sigma_1 = \varepsilon_{11} = 4$, $\varepsilon_{12} = 6$, $\tau_{11} = 4$,
$\tau_{12} = \ldots$, $v_1 = 1$ and $v_2 = 2$. The GRC violates the safety property.

Example 2 The GRC with two rails. We set $\sigma_i = 8$, $\varepsilon_{i1} = \tau_{i1} = 4$, $\varepsilon_{i2} = \tau_{i2} = 6$,
$v_1 = 30$, and $v_2 = 40$. The correctness of the safety property can be seen by
symbolic simulation. It can also be proved by bottom-up evaluation of the CLP($\mathcal{R}$)
program. This means starting the bottom-up evaluation with the set of facts
$\{(\_, \_, \mathit{going\_up}) \leftarrow x > 10.,\ (\_, \_, \mathit{up}) \leftarrow x > 10.,\ (\_, \_, \mathit{going\_down}) \leftarrow x > 10.\}$.
The GRC meets the safety property iff the initial goal
$(\mathit{far}, \mathit{far}, \mathit{up}) \wedge c_1 = 0 \wedge c_2 = 0 \wedge k = 0 \wedge g = 90 \wedge \mathit{time} = 0$
(corresponding to the initial fact) fails in the fixpoint $B$ of the execution.
Verification of the utility property for $\gamma_1 = 5$ and $\gamma_2 = 4$ for the modified GRC
proceeds as in the correctness proof of the safety property. A train takes from 4 to 6
time units to reach intersection and the gate takes up to 3 time units to lower. Note
that $\gamma_1 \ge 6 - 4 + 3 = 5$ because the moment a train enters region is the moment the
gate begins to lower. Since the gate takes up to 3 time units for going down after the
last train has left intersection, $\gamma_2 \ge 3$. This is proved by bottom-up evaluation
starting with the set of final facts generated by the utility property (for instance,
$(\mathit{far}, \mathit{far}, \mathit{going\_down}) \leftarrow c_1 < 1 \wedge c_2 < 1 \wedge x_1 > 4 \wedge x_2 > 4$ belongs to this set)
and checking that the initial goal
$(\mathit{far}, \mathit{far}, \mathit{up}) \wedge c_1 = 0 \wedge c_2 = 0 \wedge k = 0 \wedge g = 90 \wedge \mathit{time} = 0$
fails for the fixpoint of the bottom-up evaluation. Fortunately, the executions above
terminate.
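
The numeric bounds quoted for Example 2 can be reproduced with a
back-of-the-envelope calculation, sketched here in Python. This is not the
symbolic analysis itself, only arithmetic on the stated parameters. It assumes
that the gate angle ranges over 90 degrees, that the slowest gate rate governs
the worst case in both directions, and that the gate starts to lower the
moment a train enters region; all function and parameter names are
hypothetical.

```python
def worst_case_gate_time(slowest_rate, full_angle=90.0):
    """Longest time the gate may need to travel the full angle."""
    return full_angle / slowest_rate


def gate_down_before_earliest_arrival(earliest_arrival, slowest_rate):
    """Rough necessary check: can the gate be down before the fastest train arrives?"""
    return worst_case_gate_time(slowest_rate) <= earliest_arrival


def utility_bounds(earliest_arrival, latest_arrival, slowest_rate):
    """Smallest gamma_1 and gamma_2 suggested by the arithmetic in the text."""
    gate_time = worst_case_gate_time(slowest_rate)
    gamma1 = latest_arrival - earliest_arrival + gate_time
    gamma2 = gate_time
    return gamma1, gamma2


# Example 2: a train needs 4 to 6 time units, the gate rate is at least 30.
print(gate_down_before_earliest_arrival(4, 30))
print(utility_bounds(4, 6, 30))
# Example 1: the gate rate may be as low as 1.
print(gate_down_before_earliest_arrival(4, 1))
```

With a rate of at least 30 the gate needs at most 3 time units, which gives
the bounds 5 and 3 stated above; with a rate as low as 1 the gate cannot be
down before a train that arrives after 4 time units, in line with the safety
violation reported for Example 1.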

Traditionally, constraint satisfaction problems (CSPs) [1] are defined so that
all the constraints must be satisfied simultaneously. However, this requirement
does not always hold in practice. Many real-life CSPs are "soft CSPs," i.e., an
assignment of values to the variables is considered to be a solution even if
some constraints are violated. Some constraints in practical CSPs are fuzzy:
they are fully satisfied by some value assignments to the variables in the
constraint, and they are considered to be "partially" or "less" satisfied,
instead of "violated," by some other assignments. Sometimes a real-life CSP
may consist of a mixture of hard constraints and soft constraints. In these
cases we are required to find assignments that fully satisfy the hard
constraints and fully or partially satisfy the soft constraints.

A constraint satisfaction problem is defined as a tuple $(Z, D, C^c)$. $Z$ is a
finite set of variables and $D$ is a finite set of domains, one associated with
each variable in $Z$. $C^c$ is a set of constraints. Each constraint is a crisp
relation among the domains of a subset of the variables in $Z$. Each constraint
restricts the combination of values that these variables can take. The goal of
a CSP is to find a consistent assignment of values to the variables in $Z$ that
satisfies all the constraints in $C^c$. A fuzzy constraint satisfaction problem
(FCSP) is defined as a tuple $(Z, D, C^f)$. $C^f$ is a set of fuzzy constraints.
Each fuzzy constraint is a fuzzy relation among the domains of a subset of the
variables in $Z$. The satisfaction index of a fuzzy constraint tells us to what
extent the constraint is satisfied. The solution index of an FCSP $(Z, D, C^f)$
shows its overall satisfaction. It is based on the satisfaction indexes of all
the constraints in $C^f$ and is obtained by a user-defined function called the
satisfaction function. The threshold is a user-defined lower bound on the
acceptable solution index of an FCSP. The goal of an FCSP $(Z, D, C^f)$ is to
find an assignment of values to all variables in $Z$ so that the solution index
is not less than the threshold. The difference between an FCSP and a CSP lies
in the set of constraints they involve. For a CSP $(Z, D, C^c)$, the constraints
in $C^c$ are Boolean: an assignment of a tuple to the variables in a constraint
returns true (1) or false (0). For an FCSP $(Z, D, C^f)$, the constraints in
$C^f$ return values in the range from 0 to 1. Obviously, a CSP is a restricted
instance of an FCSP. The set of problems that a CSP can model is a subset of
the problems that an FCSP can model.
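
The definitions above can be made concrete with a short Python sketch. It
assumes that each fuzzy constraint is a function from an assignment to a
satisfaction index in [0, 1] and uses `min` as the satisfaction function; the
text leaves that function to the user, so `min` is only one possible choice.
The names `solution_index`, `is_solution`, `different`, and `far_apart` are
hypothetical.

```python
def solution_index(assignment, fuzzy_constraints, satisfaction_function=min):
    """Combine the satisfaction indexes of all constraints into one solution index."""
    indexes = [constraint(assignment) for constraint in fuzzy_constraints]
    return satisfaction_function(indexes)


def is_solution(assignment, fuzzy_constraints, threshold, satisfaction_function=min):
    """An assignment is a solution if its solution index reaches the threshold."""
    return solution_index(assignment, fuzzy_constraints, satisfaction_function) >= threshold


def different(a):
    """Crisp constraint expressed as a fuzzy one: the index is either 0 or 1."""
    return 1.0 if a["x"] != a["y"] else 0.0


def far_apart(a):
    """Fuzzy constraint: more satisfied the further apart x and y are (illustrative)."""
    return min(1.0, abs(a["x"] - a["y"]) / 3)


assignment = {"x": 0, "y": 2}
constraints = [different, far_apart]
print(solution_index(assignment, constraints))
print(is_solution(assignment, constraints, 0.6), is_solution(assignment, constraints, 0.8))
```

A crisp CSP is the special case in which every constraint returns only 0 or 1
and the threshold is 1, which matches the remark that a CSP is a restricted
instance of an FCSP.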

A generic neural network model called GENET has been proposed by Tsang
and Wang [2] for solving CSPs with binary constraints. GENET solves CSPs
by iterative improvement and incorporates a learning strategy to escape local
minima. Lee, Leung and Won [3] later proposed E-GENET, an extended GENET
for solving non-binary CSPs. We have also developed a model called fuzzy GENET,
based on GENET, for solving binary FCSPs. In this paper, we propose fuzzy
E-GENET by merging the ideas of E-GENET and fuzzy GENET. Fuzzy E-GENET
allows the representation of non-binary FCSPs.

We have built a fuzzy E-GENET simulator. Benchmarking results show that
fuzzy E-GENET is as efficient as GENET in solving non-fuzzy problems, both
binary and non-binary. We have also used the N × (N − 1)-queens problem, a
fuzzy binary problem, to test our simulator. In this problem, N queens are
placed on an N × (N − 1) chessboard, so that there exists at least one pair of
queens attacking each other. We define that it is better for two queens
attacking each other to be separated by a greater vertical distance. The
results of fuzzy E-GENET running on a SPARCstation 10 on the N × (N − 1)-queens
problem are shown in Table 1.


Table 1. Results on the N × (N − 1)-queens problem

threshold          0.9           0.8
20 × 19 queens     [illegible]   [illegible]
30 × 29 queens     2.63s         [illegible]
40 × 39 queens     0.88s         [illegible]
50 × 49 queens     6.22s         0.77s
60 × 59 queens     9.28s         [missing]
70 × 69 queens     16.91s        2.49s

Branch-and-Price for Solving Integer Programs
with  a  Huge  Number  of  Variables: 
Methods  and  Applications 


George  L.  Nemhauser 
Georgia  Institute  of  Technology 
Abstract 

Many interesting discrete optimization problems, including airline crew
scheduling, vehicle routing and cutting stock, have "good" integer programming
formulations that require a huge number of variables. Good means that the
linear programming relaxations give tight bounds, which implies small search
trees. We will discuss classes of problems for which this type of formulation
is desirable, and the special (non-standard) methodology that is needed to
solve these integer programs.

Constraint Databases


Dina  Q.  Goldin 
Brown  University 

Abstract 

Paris Kanellakis's pioneering paper in 1990 provided a framework for constraint
databases by combining concepts from constraint logic programming and
relational databases. The principal idea is to generalize a tuple (or record)
data type to a conjunction of constraints from an appropriate language, for
example, order constraints or linear arithmetic constraints. Such a tuple can
be seen as representing a large, possibly even infinite, set of points in a
compact way (e.g., for spatial databases and GIS). Constraint databases have
since become a very active area of database research.
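
The idea of a tuple as a conjunction of constraints can be illustrated with a
small Python sketch. It is a toy model under stated assumptions, not a
description of any constraint database system: constraints are linear
inequalities of the form a · x ≤ b, a generalized tuple is a conjunction of
them, and a relation is a finite list of generalized tuples. All class and
function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LinearConstraint:
    coeffs: Sequence[float]  # a in the inequality a . x <= b
    bound: float             # b in the inequality a . x <= b

    def holds(self, point):
        return sum(a * x for a, x in zip(self.coeffs, point)) <= self.bound


@dataclass(frozen=True)
class GeneralizedTuple:
    constraints: Tuple[LinearConstraint, ...]  # read as a conjunction

    def contains(self, point):
        return all(c.holds(point) for c in self.constraints)


def relation_contains(relation, point):
    """A relation is a finite set of generalized tuples, read as a disjunction."""
    return any(t.contains(point) for t in relation)


# The unit square 0 <= x <= 1, 0 <= y <= 1: four constraints, infinitely many points.
unit_square = GeneralizedTuple((
    LinearConstraint((1, 0), 1), LinearConstraint((-1, 0), 0),
    LinearConstraint((0, 1), 1), LinearConstraint((0, -1), 0),
))
print(relation_contains([unit_square], (0.5, 0.5)))
print(relation_contains([unit_square], (2.0, 0.0)))
```

Four stored inequalities stand for every point of the square, which is the
compact representation of an infinite point set that the abstract refers to.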

After a brief introduction to relational databases, we explain the semantics
of constraint database relations and queries, providing complexity results for
several specific constraint classes. We consider the various relational
querying paradigms (declarative, procedural, and logic programming) and their
reinterpretation in the presence of constraints as first-class data. We
highlight the basic design principles for constraint databases, such as query
closure and safety, efficiency of data representation and data access, and
query optimization.

We discuss Paris Kanellakis's more recent work, including work on indexing
and constraint query algebras, and survey other developments in the area
(aggregation, complex objects, expressive power). The ultimate goal of Paris
Kanellakis's research was to enable commercial-quality implementations of
constraint databases. We look at some possible applications for constraint
databases, and at the implementation efforts currently under way. We conclude
by considering the issues and the challenges that lie ahead.

Complexity-Theoretic Aspects
of Programming Language Design


Harry  G.  Mairson 
Brandeis  University 
Abstract 

We survey three of Paris Kanellakis's contributions to the complexity-theoretic
analysis of constructs in the design of programming languages. These are (1)
his result that first-order unification is complete for polynomial time; (2)
his contributions to the complexity analysis of type inference for
polymorphically typed functional programming languages, which was proven to be
complete for exponential time; and (3) his work on the expressibility of simply
typed lambda calculus when used as a functional database programming language.
These research investigations are interrelated, emphasizing common themes and
difficulties in the design of programming languages, with concrete
implications that can be appreciated by language designers. First-order
unification is a ubiquitous building block in implementations of
sophisticated programming languages. It is, for example, the workhorse of
logic programming engines, and the essential component of compile-time type
analysis. The research on unification provided a secure foundation for
understanding the complexity of automatic polymorphic type inference in
functional languages such as ML and Haskell. Modern functional languages are
based on the typed lambda calculus, so it is then natural to consider its
computational expressiveness. We describe how the degree of higher-order
functionality in simply typed terms can be related directly to well-known
complexity classes.